Named Entity Recognition for Chinese Social Media with Jointly Trained Embeddings (Paper)
2015 · Citations: 402
Topic Modeling · Natural Language Processing Techniques · Text and Document Classification Technologies
Abstract
We consider the task of named entity recognition for Chinese social media. The long line of work in Chinese NER has focused on formal domains, and NER for social media has been largely restricted to English. We present a new corpus of Weibo messages annotated for both name and nominal mentions. Additionally, we evaluate three types of neural embeddings for representing Chinese text. Finally, we propose a joint training objective for the embeddings that makes use of both NER-labeled text and unlabeled raw text. Our methods yield a 9% improvement over a state-of-the-art baseline.