Incremental clustering and dynamic information retrieval 论文

1997引用 358
Data Management and AlgorithmsCaching and Content DeliveryAdvanced Clustering Algorithms Research

摘要

Motivated by applications such as document and image classification in information retrieval, we consider the problem of clustering dynamic point sets in a metric space. We propose a model-c~led incremental clustering which is based on a careful analysis of the requirements of the information retrieval application, and which should also be useful in other applications. The goal is to efficiently maintain clusters of small diameter as new points are inserted. We analyze several natural greedy algorithms and demonstrate that they perform poorly. We propose new deterministic and randomized incremental clustering algorithms which have a provably good performance.