Open information extraction: the second generation (Paper)

2011, cited by 447

Natural Language Processing Techniques, Topic Modeling, Web Data Mining and Analysis

Abstract

How do we scale information extraction to the massive size and unprecedented heterogeneity of the Web corpus? Beginning in 2003, our KnowItAll project has sought to extract high-quality knowledge from the Web. In 2007, we introduced the Open Information Extraction (Open IE) paradigm, which eschews hand-labeled training examples and avoids domain-specific verbs and nouns, to develop unlexicalized, domain-independent extractors that scale to the Web corpus. Open IE systems have extracted billions of assertions as the basis for both commonsense knowledge and novel question-answering systems. This paper describes the second generation of Open IE systems, which rely on a novel model of how relations and their arguments are expressed in English sentences to double precision/recall compared with previous systems such as TEXTRUNNER and WOE.

Authors (4 of 5 shown)

Oren Etzioni

Mausam

Stephen Soderland

Janara Christensen
