Simple Semi-supervised Dependency Parsing (Paper)

2008 | RECERCAT (Consorci de Serveis Universitaris de Catalunya) | Citations: 444

Topics: Natural Language Processing Techniques; Topic Modeling; Text Readability and Simplification

Abstract

We present a simple and effective semi-supervised method for training dependency parsers. We focus on the problem of lexical representation, introducing features that incorporate word clusters derived from a large unannotated corpus. We demonstrate the effectiveness of the approach in a series of dependency parsing experiments on the Penn Treebank and Prague Dependency Treebank, and we show that the cluster-based features yield substantial gains in performance across a wide range of conditions. For example, in the case of English unlabeled second-order parsing, we improve from a baseline accuracy of 92.02% to 93.16%, and in the case of Czech unlabeled second-order parsing, we improve from a baseline accuracy of 86.13% to 87.13%. In addition, we demonstrate that our method also improves performance when small amounts of training data are available, and can roughly halve the amount of supervised data required to reach a desired level of performance.
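The idea of cluster-based features can be illustrated with a minimal sketch. The abstract does not specify the clustering algorithm or the feature templates, so everything below is an assumption made for illustration: the cluster table `CLUSTERS`, its bit-string IDs, the 4-bit prefix length, and the helper names `cluster_prefix` and `arc_features` are all hypothetical. The sketch only shows how word-level features for a candidate head-modifier arc might be supplemented with coarser features built from cluster IDs.

```python
from typing import Dict, List, Optional

# Hypothetical cluster table: word -> bit-string cluster ID, as might be
# produced by a hierarchical clustering of an unannotated corpus.
# The entries are invented for illustration only.
CLUSTERS: Dict[str, str] = {
    "dog": "110100",
    "cat": "110101",
    "barks": "011010",
}


def cluster_prefix(word: str, length: Optional[int] = None) -> str:
    """Return the first `length` bits of the word's cluster ID.

    With length=None the full bit string is returned; words missing
    from the table map to the placeholder "UNK".
    """
    bits = CLUSTERS.get(word.lower())
    if bits is None:
        return "UNK"
    return bits if length is None else bits[:length]


def arc_features(head: str, head_pos: str, mod: str, mod_pos: str) -> List[str]:
    """Build string features for one candidate head -> modifier arc."""
    # Baseline features: word forms and part-of-speech tags.
    feats = [
        f"hw={head}|mw={mod}",
        f"hp={head_pos}|mp={mod_pos}",
    ]
    # Assumed cluster-based features: short prefixes act as coarse classes,
    # full bit strings act as a finer-grained stand-in for the word forms.
    hc4, mc4 = cluster_prefix(head, 4), cluster_prefix(mod, 4)
    feats += [
        f"hc4={hc4}|mc4={mc4}",
        f"hc={cluster_prefix(head)}|mc={cluster_prefix(mod)}",
        f"hp={head_pos}|mc4={mc4}",
    ]
    return feats


if __name__ == "__main__":
    for feat in arc_features("barks", "VBZ", "dog", "NN"):
        print(feat)
```

In this toy table, "dog" and "cat" share a 4-bit prefix, so an arc seen in training with one of them yields a prefix-level feature that also fires for the other, even though the word-form features differ. That sharing is the kind of lexical generalization the abstract attributes to cluster-based features.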

Authors (3)

Michael J. Collins

Xavier Carreras

Terry Koo
