Abstract
arXiv:2603.04678v3

Language models often respond inconsistently to translation-equivalent prompts across languages, undermining the reliability of multilingual systems. To quantify this, we give an information-theoretic definition of crosslingual consistency as a divergence bound between a model's response distribution and its round-trip pushforward across languages. We then introduce penalized consistency optimization (PCO), a post-training procedure that couples this divergence with a Kullback-Leibler penalty to a fixed reference language model. Because direct optimization of PCO requires expensive on-policy rollouts, we propose a tractable surrogate, direct consistency optimization (DCO), which can be optimized off-policy. Across diverse language models and 26 languages, DCO significantly improves crosslingual consistency, outperforms existing methods, and enables targeted alignment of low-resource languages.