Effective self-training for parsing 论文
2006引用 619
Natural Language Processing TechniquesTopic ModelingText Readability and Simplification
摘要
We present a simple, but surprisingly effective, method of self-training a twophase parser-reranker system using readily available unlabeled data. We show that this type of bootstrapping is possible for parsing when the bootstrapped parses are processed by a discriminative reranker. Our improved model achieves an f -score of 92.1%, an absolute 1.1% improvement (12% error reduction) over the previous best result for Wall Street Journal parsing. Finally, we provide some analysis to better understand the phenomenon.