Fast and intuitive clustering of web documents (paper)

1997 · Cited by 267

Algorithms and Data Compression Data Mining Algorithms and Applications Web Data Mining and Analysis

Abstract

Conventional document retrieval systems (e.g., Alta Vista) return long lists of ranked documents in response to user queries. Recently, document clustering has been put forth as an alternative method of organizing the results of a retrieval [4]. A person browsing the clusters can discover patterns that would be overlooked in the traditional ranked-list presentation. In this context, a document clustering algorithm has two key requirements. First, the algorithm ought to produce clusters that are easy to browse: a user needs to determine at a glance whether the contents of a cluster are of interest. Second, the algorithm has to be fast even when applied to thousands of documents with no preprocessing. This paper describes several novel clustering methods, which intersect the documents in a cluster to determine the set of words (or phrases) shared by all the documents in the cluster. We report on experiments that evaluate these intersection-based clustering methods on collections of sn...
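The intersection idea in the abstract can be illustrated with a small sketch. It is not the paper's algorithm: the tokenizer, the greedy single-pass merge, and the `min_shared` threshold are all assumptions made for illustration. The only part taken from the abstract is that a cluster is described by the set of words shared by all of its documents.

```python
import re


def word_set(text):
    """Lowercase a document and return its set of distinct words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def shared_words(cluster):
    """Intersect the word sets of every document in the cluster."""
    sets = [word_set(doc) for doc in cluster]
    return set.intersection(*sets) if sets else set()


def greedy_merge(docs, min_shared=2):
    """Hypothetical greedy pass (an assumption, not the paper's method).

    Each document joins the first cluster whose shared-word set would
    still contain at least `min_shared` words; otherwise it starts a
    new cluster. Returns a list of (documents, shared_word_set) pairs.
    """
    clusters = []
    for doc in docs:
        words = word_set(doc)
        for i, (members, common) in enumerate(clusters):
            overlap = common & words
            if len(overlap) >= min_shared:
                clusters[i] = (members + [doc], overlap)
                break
        else:
            # No existing cluster shares enough words with this document.
            clusters.append(([doc], words))
    return clusters


if __name__ == "__main__":
    snippets = [
        "jaguar car dealer prices",
        "jaguar car reviews",
        "jaguar cat habitat",
        "big cat habitat jaguar photos",
    ]
    for members, common in greedy_merge(snippets):
        print(sorted(common), "<-", members)
```

The shared-word set doubles as a cluster label, which is what makes such clusters quick to judge at a glance; a single pass over the documents also requires no preprocessing of the collection.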

Authors

Oren Etzioni

Richard M. Karp

Omid Madani

Oren Zamir
