Exploiting Wikipedia as External Knowledge for Named Entity Recognition (Paper)

2007, cited by 277
Natural Language Processing Techniques; Topic Modeling; Wikis in Education and Collaboration

Abstract

We explore the use of Wikipedia as external knowledge to improve named entity recognition (NER). Our method retrieves the corresponding Wikipedia entry for each candidate word sequence and extracts a category label from the first sentence of the entry, which can be thought of as a definition part. These category labels are used as features in a CRF-based NE tagger. We demonstrate using the CoNLL 2003 dataset that the Wikipedia category labels extracted by such a simple method actually improve the accuracy of NER.
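
The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the `TOY_WIKI` dictionary stands in for real Wikipedia lookups, the copula-based rule in `category_label` is one simple way to read a category off a definition sentence, and the longest-match search and `B-`/`I-` feature encoding in `wiki_features` are hypothetical choices.

```python
import re

# Hypothetical stand-in for Wikipedia: lowercased title -> first sentence of the entry.
TOY_WIKI = {
    "franz fischler": "Franz Fischler is an Austrian politician.",
    "european commission": "The European Commission is the executive branch of the European Union.",
}

# Assumption: the definition follows a copula plus an article ("is a", "was the", ...).
COPULA = re.compile(r"\b(?:is|was|are|were)\s+(?:an?|the)\s+(.+)", re.IGNORECASE)
# Assumption: the head noun is the last word before one of these function words.
STOP = {"of", "in", "that", "which", "who", "for", "from", "at", "on", "and"}


def category_label(first_sentence):
    """Return a rough category label from a definition sentence, or None."""
    match = COPULA.search(first_sentence)
    if not match:
        return None
    head = None
    for word in re.findall(r"[A-Za-z]+", match.group(1)):
        if word.lower() in STOP:
            break
        head = word.lower()
    return head


def wiki_features(tokens, max_len=4):
    """Assign one category feature per token, using the longest matching title."""
    feats = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate word sequence starting at position i first.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = " ".join(tokens[i:i + n]).lower()
            label = category_label(TOY_WIKI.get(key, ""))
            if label:
                feats[i] = "B-" + label
                for j in range(i + 1, i + n):
                    feats[j] = "I-" + label
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return feats


tokens = ["Franz", "Fischler", "addressed", "the", "European", "Commission"]
features = wiki_features(tokens)
for token, feat in zip(tokens, features):
    print(token, feat)
```

In a CRF-based tagger, each of these per-token strings would be added to the token's feature set alongside the usual word, capitalization, and context features.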