Parallel implementations of word alignment tool (Paper)

2008 · Cited by 369

Details

Publication date
2008-01-01
Publication year
2008

Keywords

Natural Language Processing Techniques; Topic Modeling; Mathematics, Computing, and Information Processing

Abstract

Training word alignment models on large corpora is a very time-consuming process. This paper describes two parallel implementations of GIZA++ that accelerate this word alignment process. One of the implementations runs on computer clusters; the other runs on a multi-processor system using multi-threading technology. Results show a near-linear speed-up with the number of CPUs used, and alignment quality is preserved.