Details
- Source site: ArXiv CS.CL
- Authors: Seth Aycock, Fedor Vitiugin, Aleksandr Umnov, Christof Monz, Khalil Sima'an
- Article type: NEWS
- Language: en
- Publication date: 2026-05-26
Abstract
arXiv:2605.25846v1

Endowing models with consistent multilingual performance can be achieved by mixing pre-training data, or post-training approaches such as language-specific model merging. In this work, we test whether merging can be applied to monolingually pre-trained models. We conduct a controlled study on the efficacy of mixed, merged, and monolingual pre-training setups. We find that while monolingual pre-training results in strong in-language performance, merging any combination of monolingual models leads to performance collapse due to interference. Our analysis suggests representational similarity is a prerequisite for model merging. We therefore conclude that the flexibility of merging in fine-tuning does not extend trivially to language-specific pre-training.
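The two ideas the abstract relies on, merging models in parameter space and measuring representational similarity, can be illustrated with a minimal sketch. The abstract does not say which merging method or similarity measure the authors use, so both choices here are assumptions made for illustration: uniform parameter averaging stands in for "merging", and linear CKA (centered kernel alignment) stands in for "representational similarity". The function names `average_merge` and `linear_cka` are hypothetical, and parameters are plain Python lists rather than real model tensors.

```python
import math


def average_merge(state_dicts, weights=None):
    """Merge models by element-wise (weighted) parameter averaging.

    Assumption: every model has the same parameter names and shapes,
    stored here as flat lists of floats. This is a stand-in for the
    merging step, not the method used in the paper.
    """
    n = len(state_dicts)
    if weights is None:
        weights = [1.0 / n] * n  # uniform average by default
    merged = {}
    for name in state_dicts[0]:
        length = len(state_dicts[0][name])
        merged[name] = [
            sum(w * sd[name][i] for w, sd in zip(weights, state_dicts))
            for i in range(length)
        ]
    return merged


def _center(X):
    """Subtract the per-feature mean from an (examples x features) matrix."""
    n, d = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(d)]
    return [[row[j] - means[j] for j in range(d)] for row in X]


def _cross_frob_sq(X, Y):
    """Squared Frobenius norm of X^T Y for matrices with matching rows."""
    total = 0.0
    for a in range(len(X[0])):
        for b in range(len(Y[0])):
            s = sum(X[i][a] * Y[i][b] for i in range(len(X)))
            total += s * s
    return total


def linear_cka(X, Y):
    """Linear CKA between two activation matrices on the same inputs.

    Returns a value in [0, 1]; 1 means the representations match up to
    rotation and isotropic scaling. Assumes neither matrix is constant.
    """
    Xc, Yc = _center(X), _center(Y)
    numerator = _cross_frob_sq(Xc, Yc)
    denominator = math.sqrt(_cross_frob_sq(Xc, Xc) * _cross_frob_sq(Yc, Yc))
    return numerator / denominator


# Toy usage: two "models" with one parameter vector each.
model_a = {"w": [1.0, 3.0]}
model_b = {"w": [3.0, 5.0]}
merged = average_merge([model_a, model_b])

# Toy activations of two models on the same four inputs.
acts_a = [[1.0, 0.0], [-1.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
acts_b = [[0.0, 0.0], [0.0, 0.0], [1.0, 0.0], [-1.0, 0.0]]
similarity = linear_cka(acts_a, acts_b)
```

In this toy setup the two activation matrices respond to disjoint inputs, so their similarity score is low. Under the abstract's claim, that is the kind of situation in which averaging the parameters would be expected to interfere rather than combine capabilities.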