Annotated Gigaword (paper)

2012 · Cited by 227

Natural Language Processing Techniques Topic Modeling Speech and dialogue systems

Abstract

We have created layers of annotation on the English Gigaword v.5 corpus to render it useful as a standardized corpus for knowledge extraction and distributional semantics. Most existing large-scale work is based on inconsistent corpora which often have needed to be re-annotated by research teams independently, each time introducing biases that manifest as results that are only comparable at a high level. We provide to the community a public reference set based on current state-of-the-art syntactic analysis and coreference resolution, along with an interface for programmatic access. Our goal is to enable broader involvement in large-scale knowledge-acquisition efforts by researchers that otherwise may not have had the ability to produce such a resource on their own.

Authors (3)

Benjamin Van Durme

Matthew R. Gormley

Courtney Napoles
