Sentence Simplification by Monolingual Machine Translation (paper)

2012 · Research portal (Tilburg University) · Cited by 239

Details

Publication venue
Research portal (Tilburg University)
Publication date
2012-07-08
Publication year
2012

Keywords

Text Readability and Simplification; Natural Language Processing Techniques; Topic Modeling

Abstract

In this paper we describe a method for simplifying sentences using Phrase-Based Machine Translation, augmented with a re-ranking heuristic based on dissimilarity, and trained on a monolingual parallel corpus. We compare our system to a word-substitution baseline and two state-of-the-art systems, all trained and tested on paired sentences from the English part of Wikipedia and Simple Wikipedia. Human test subjects judge the output of the different systems. Analysing the judgements shows that by relatively careful phrase-based paraphrasing our model achieves similar simplification results to state-of-the-art systems, while generating better formed output. We also argue that text readability metrics such as the Flesch-Kincaid grade level should be used with caution when evaluating the output of simplification systems.
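
As an illustration of the dissimilarity-based re-ranking mentioned in the abstract, the sketch below picks, from a decoder's n-best list, the candidate that differs most from the source sentence. The abstract does not say how dissimilarity is measured, so the word-level Levenshtein distance used here is an assumption, and the names `word_edit_distance`, `rerank_by_dissimilarity`, and the example n-best list are hypothetical.

```python
def word_edit_distance(a, b):
    """Levenshtein distance between two token lists (assumed dissimilarity measure)."""
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1,          # delete tok_a
                            curr[j - 1] + 1,      # insert tok_b
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def rerank_by_dissimilarity(source, n_best):
    """Return the n-best candidate that is most dissimilar to the source.

    Ties go to the candidate that appears earlier in n_best, i.e. the one
    the decoder ranked higher (max() returns the first maximal element).
    """
    src_tokens = source.lower().split()
    return max(
        n_best,
        key=lambda cand: word_edit_distance(src_tokens, cand.lower().split()),
    )


# Hypothetical n-best output of a monolingual translation system.
source = "the cat perched upon the mat"
n_best = [
    "the cat perched upon the mat",  # identical to the source: distance 0
    "the cat sat upon the mat",      # one substitution: distance 1
    "the cat sat on the mat",        # two substitutions: distance 2
]
print(rerank_by_dissimilarity(source, n_best))
```

The motivation for such a heuristic is that a system trained on monolingual parallel data tends to rank near-copies of the input highest, so preferring a more distant candidate from a short n-best list encourages an actual rewrite while staying within outputs the decoder already considers plausible.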