Tracking and summarizing news on a daily basis with Columbia's Newsblaster (paper)

2002, cited by 233
Web Data Mining and Analysis; Natural Language Processing Techniques; Algorithms and Data Compression

Abstract

Recently, there have been significant advances in several areas of language technology, including clustering, text categorization, and summarization. However, efforts to combine technology from these areas in a practical system for information access have been limited. In this paper, we present Columbia's Newsblaster system for online news summarization. It combines many of the tools developed at Columbia over the years into a system that crawls the web for news articles, clusters them by topic, and produces a multidocument summary for each cluster.
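The cluster-then-summarize flow described in the abstract can be illustrated with a minimal sketch. The abstract does not specify Newsblaster's algorithms, so everything below is an assumption made for illustration: the greedy cosine-similarity clustering, the frequency-based sentence scoring, the small stopword list, and the function names `cluster_articles` and `summarize` are all hypothetical stand-ins, and crawling is omitted (articles are passed in as strings).

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = {"the", "a", "an", "of", "in", "to", "and", "that", "will", "is", "for", "on"}

def tokenize(text):
    """Lowercase word tokens with stopwords removed."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]

def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def cluster_articles(articles, threshold=0.3):
    """Greedy single-pass clustering: join the most similar cluster or start a new one."""
    clusters = []  # each cluster: {"centroid": Counter, "docs": [str]}
    for text in articles:
        vec = Counter(tokenize(text))
        best, best_sim = None, threshold
        for c in clusters:
            sim = cosine(vec, c["centroid"])
            if sim >= best_sim:
                best, best_sim = c, sim
        if best is None:
            clusters.append({"centroid": vec, "docs": [text]})
        else:
            best["centroid"] = best["centroid"] + vec  # summed term counts
            best["docs"].append(text)
    return [c["docs"] for c in clusters]

def summarize(docs, max_sentences=2):
    """Extractive summary: rank sentences by the average cluster-wide frequency of their words."""
    freq = Counter(t for d in docs for t in tokenize(d))
    sentences = [s.strip() for d in docs for s in re.split(r"(?<=[.!?])\s+", d) if s.strip()]
    unique = list(dict.fromkeys(sentences))  # drop duplicates, keep order

    def score(sentence):
        toks = tokenize(sentence)
        return sum(freq[t] for t in toks) / len(toks) if toks else 0.0

    return sorted(unique, key=score, reverse=True)[:max_sentences]

if __name__ == "__main__":
    articles = [
        "The city council approved the new transit budget. Buses will run more often.",
        "The council approved a transit budget that adds buses. Riders welcomed the budget.",
        "The striker scored twice in the cup final. Fans celebrated the cup win.",
    ]
    for docs in cluster_articles(articles):
        print(len(docs), "article(s):", summarize(docs, max_sentences=1))
```

A production system would replace each stage with something stronger (for example TF-IDF weighting, hierarchical clustering, and redundancy-aware sentence selection), but the sketch shows how the three stages hand data to one another.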