Event: Backtranslation Augmented Direct Preference Optimization for Neural Machine Translation

PRODUCT_LAUNCH | 2026-06-02 | Impact: MEDIUM

arXiv:2604.25702v2
Announce Type: replace

Abstract: Contemporary neural machine translation (NMT) systems are almost exclusively built by training on supervised parallel data. Despite the tremendous progress achieved, these systems still exhibit persistent translation errors. This paper proposes that a post-training paradigm based on reinforcement learning (RL) can effectively rectify such mistakes. We intro