Web-scale information extraction in KnowItAll (paper)

2004 · Citations: 751

Web Data Mining and Analysis; Data Quality and Management; Advanced Text Analysis Techniques

Abstract

Manually querying search engines in order to accumulate a large body of factual information is a tedious, error-prone process of piecemeal search. Search engines retrieve and rank potentially relevant documents for human perusal, but do not extract facts, assess confidence, or fuse information from multiple documents. This paper introduces KnowItAll, a system that aims to automate the tedious process of extracting large collections of facts from the web in an autonomous, domain-independent, and scalable manner. The paper describes preliminary experiments in which an instance of KnowItAll, running for four days on a single machine, was able to automatically extract 54,753 facts. KnowItAll associates a probability with each fact, enabling it to trade off precision and recall. The paper analyzes KnowItAll's architecture and reports on lessons learned for the design of large-scale information extraction systems.

Authors (4 of 9 shown)

Oren Etzioni

Alexander Yates

Daniel S. Weld

Stephen Soderland
