11,001 new features for statistical machine translation (paper)

2009, cited by 239
Natural Language Processing Techniques; Topic Modeling; Algorithms and Data Compression

Abstract

We use the Margin Infused Relaxed Algorithm of Crammer et al. to add a large number of new features to two machine translation systems: the Hiero hierarchical phrase-based translation system and our syntax-based translation system. On a large-scale Chinese-English translation task, we obtain statistically significant improvements of +1.5 Bleu and +1.1 Bleu, respectively. We analyze the impact of the new features and the performance of the learning algorithm.
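
For readers unfamiliar with the Margin Infused Relaxed Algorithm, the following is a minimal sketch of a single-constraint, clipped update over sparse feature vectors. It is not the training procedure used in the paper, whose details the abstract does not give. The function name `mira_update`, the step-size cap `C`, and the toy feature names are assumptions made for illustration only.

```python
from typing import Dict

Features = Dict[str, float]


def dot(weights: Features, feats: Features) -> float:
    """Sparse dot product between a weight vector and a feature vector."""
    return sum(weights.get(name, 0.0) * value for name, value in feats.items())


def mira_update(weights: Features, feats_good: Features, feats_bad: Features,
                loss: float, C: float = 0.01) -> Features:
    """Return new weights after one simplified MIRA step (illustrative only).

    The update makes the smallest change to `weights` such that the better
    translation outscores the worse one by a margin of at least `loss`,
    with the step size clipped at the assumed cap `C`.
    """
    # Feature difference between the better and the worse translation.
    diff = dict(feats_good)
    for name, value in feats_bad.items():
        diff[name] = diff.get(name, 0.0) - value

    norm_sq = sum(value * value for value in diff.values())
    if norm_sq == 0.0:
        return dict(weights)  # identical feature vectors: nothing to learn

    # How far the current margin falls short of the required loss.
    violation = loss - dot(weights, diff)
    tau = min(C, max(0.0, violation / norm_sq))

    new_weights = dict(weights)
    for name, value in diff.items():
        new_weights[name] = new_weights.get(name, 0.0) + tau * value
    return new_weights


if __name__ == "__main__":
    # Hypothetical sparse features for two candidate translations.
    w = mira_update({}, {"rule_a": 1.0}, {"rule_b": 1.0}, loss=1.0, C=1.0)
    print(w)
```

Because each step touches only the features that differ between the two candidates, an update of this kind stays cheap even when the feature set runs into the thousands, which is the setting the abstract describes.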