Identifying Relations for Open Information Extraction (Paper)

2011 · Citations: 1152

Natural Language Processing Techniques, Topic Modeling, Web Data Mining and Analysis

Abstract

Open Information Extraction (IE) is the task of extracting assertions from massive corpora without requiring a pre-specified vocabulary. This paper shows that the output of state-of-the-art Open IE systems is rife with uninformative and incoherent extractions. To overcome these problems, we introduce two simple syntactic and lexical constraints on binary relations expressed by verbs. We implemented the constraints in the REVERB Open IE system, which more than doubles the area under the precision-recall curve relative to previous extractors such as TEXTRUNNER and WOE^pos. More than 30% of REVERB’s extractions are at precision 0.8 or higher, compared to virtually none for earlier systems. The paper concludes with a detailed analysis of REVERB’s errors, suggesting directions for future work.
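
The abstract names the two constraints without spelling them out. As a rough illustration only, the sketch below assumes the syntactic constraint is a part-of-speech pattern of the form V | V P | V W* P (a verb, optionally followed by nouns, adjectives, adverbs, pronouns or determiners, and ending in a preposition or particle), and that the lexical constraint keeps a relation phrase only if it appears with at least some minimum number of distinct argument pairs. The tag names, function names, and threshold are hypothetical simplifications and are not REVERB's actual code or API.

```python
import re
from collections import defaultdict

# Simplified POS tags mapped to one-letter codes so the pattern can be a regex.
# These tag names are illustrative assumptions, not a real tagger's tag set.
TAG_CODES = {
    "VERB": "V",
    "PREP": "P", "PART": "P",
    "NOUN": "W", "ADJ": "W", "ADV": "W", "PRON": "W", "DET": "W",
}

# Assumed syntactic constraint: V | V P | V W* P over the codes above.
RELATION_PATTERN = re.compile(r"V(?:W*P)?")


def longest_relation_phrase(tagged_tokens):
    """Return the longest token span matching the pattern, or None.

    tagged_tokens is a list of (word, tag) pairs; input is assumed to be
    already tokenized and tagged.
    """
    codes = "".join(TAG_CODES.get(tag, "O") for _, tag in tagged_tokens)
    best = None
    for match in RELATION_PATTERN.finditer(codes):
        if best is None or match.end() - match.start() > best[1] - best[0]:
            best = (match.start(), match.end())
    if best is None:
        return None
    return " ".join(word for word, _ in tagged_tokens[best[0]:best[1]])


def filter_by_distinct_args(extractions, min_pairs):
    """Assumed lexical constraint: keep (arg1, rel, arg2) triples whose
    relation phrase occurs with at least min_pairs distinct argument pairs."""
    pairs_by_relation = defaultdict(set)
    for arg1, rel, arg2 in extractions:
        pairs_by_relation[rel].add((arg1, arg2))
    return [e for e in extractions if len(pairs_by_relation[e[1]]) >= min_pairs]


sentence = [
    ("Faust", "NOUN"), ("made", "VERB"), ("a", "DET"), ("deal", "NOUN"),
    ("with", "PREP"), ("the", "DET"), ("devil", "NOUN"),
]
print(longest_relation_phrase(sentence))
```

Preferring the longest match is what keeps the phrase "made a deal with" together instead of stopping at the bare verb "made", which would be an uninformative extraction in the abstract's sense. The frequency filter is a stand-in for rejecting overly specific relation phrases that occur with only one argument pair.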

Authors

Oren Etzioni

Stephen Soderland

Anthony Fader
