Contrastive estimation (paper)

2005, cited by 320

Biotechnology, Natural Language Processing Techniques, Topic Modeling, Genomics and Phylogenetic Studies

Abstract

Conditional random fields (Lafferty et al., 2001) are quite effective at sequence labeling tasks like shallow parsing (Sha and Pereira, 2003) and named-entity extraction (McCallum and Li, 2003). CRFs are log-linear, allowing the incorporation of arbitrary features into the model. To train on unlabeled data, we require unsupervised estimation methods for log-linear models; few exist. We describe a novel approach, contrastive estimation. We show that the new technique can be intuitively understood as exploiting implicit negative evidence and is computationally efficient. Applied to a sequence labeling problem (POS tagging given a tagging dictionary and unlabeled text), contrastive estimation outperforms EM (with the same feature set), is more robust to degradations of the dictionary, and can largely recover by modeling additional features.
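
The "implicit negative evidence" idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation and rests on several assumptions that the abstract does not state: the negative evidence for a sentence is taken to be its "neighborhood" of sentences with one adjacent word pair swapped, the feature set is a toy one (emission and transition indicators), the tag set is invented, no tagging dictionary is used, and the sums are computed by brute-force enumeration rather than any efficient method. All function names are hypothetical.

```python
import itertools
import math

TAGS = ["D", "N", "V"]  # toy tag set, assumed for illustration


def score(theta, words, tags):
    """Unnormalized log-linear score: sum of weights of the active features."""
    total = 0.0
    prev = "<s>"
    for word, tag in zip(words, tags):
        total += theta.get(("emit", tag, word), 0.0)   # emission feature
        total += theta.get(("trans", prev, tag), 0.0)  # transition feature
        prev = tag
    return total


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_mass(theta, words):
    """Log of the total unnormalized score of a sentence over all taggings."""
    return log_sum_exp(
        [score(theta, words, tags)
         for tags in itertools.product(TAGS, repeat=len(words))]
    )


def neighborhood(words):
    """The sentence itself plus every variant with one adjacent pair swapped
    (an assumed choice of neighborhood; duplicates are dropped)."""
    out = {tuple(words)}
    for i in range(len(words) - 1):
        swapped = list(words)
        swapped[i], swapped[i + 1] = swapped[i + 1], swapped[i]
        out.add(tuple(swapped))
    return sorted(out)


def contrastive_log_likelihood(theta, words):
    """Log of the observed sentence's mass divided by its neighborhood's mass."""
    numerator = log_mass(theta, words)
    denominator = log_sum_exp([log_mass(theta, n) for n in neighborhood(words)])
    return numerator - denominator
```

In this sketch, with all weights at zero every sentence in the neighborhood gets the same mass, so the objective equals minus the log of the neighborhood size. Weights that make the observed word order score higher than its perturbed neighbors raise the objective, which is the sense in which the neighbors act as negative examples.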

Authors

Jason Eisner

Noah A. Smith
