Template detection via data mining and its applications (paper)

Published 2002. Cited by 271.

Web Data Mining and Analysis; Data Mining Algorithms and Applications; Text and Document Classification Technologies

Abstract

We formulate and propose the template detection problem, and suggest a practical solution for it based on counting frequent item sets. We show that the use of templates is pervasive on the web. We describe three principles, which characterize the assumptions made by hypertext information retrieval (IR) and data mining (DM) systems, and show that templates are a major source of violation of these principles. As a consequence, basic "pure" implementations of simple search algorithms coupled with template detection and elimination show surprising increases in precision at all levels of recall.
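As a rough illustration of the counting idea mentioned in the abstract, the sketch below treats a text block as part of a template when it recurs on enough pages of the same site. This is a simplified stand-in and not the authors' algorithm: the function names `detect_template_blocks` and `strip_templates`, the `min_support` threshold, and the whitespace and case normalisation are all assumptions made for this example.

```python
from collections import defaultdict


def _normalise(block):
    # Assumed normalisation: collapse whitespace and ignore case so that
    # trivially different copies of the same block compare equal.
    return " ".join(block.split()).lower()


def detect_template_blocks(pages, min_support=3):
    """Return normalised blocks that occur on at least `min_support` pages.

    `pages` maps a page identifier to the list of text blocks on that page.
    """
    support = defaultdict(set)
    for page_id, blocks in pages.items():
        for block in blocks:
            # Count each page at most once per block by storing page ids in a set.
            support[_normalise(block)].add(page_id)
    return {key for key, page_ids in support.items() if len(page_ids) >= min_support}


def strip_templates(pages, min_support=3):
    """Return a copy of `pages` with the detected template blocks removed."""
    templates = detect_template_blocks(pages, min_support)
    return {
        page_id: [b for b in blocks if _normalise(b) not in templates]
        for page_id, blocks in pages.items()
    }
```

Under these assumptions, a navigation bar repeated on every page of a site would be removed before indexing, while page-specific text would be kept.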

Authors

Sridhar Rajagopalan

Ziv Bar-Yossef
