Leveraging Routing Dynamics in Mixture-of-Experts Models for Efficient Language Adaptation

Event: PRODUCT_LAUNCH · Date: 2026-05-29 · Impact: MEDIUM

arXiv:2605.29714v1 (announce type: new)

Abstract: Mixture-of-Experts (MoE) models are widely used to scale language models, yet their expert routing behavior and adaptation in a multilingual setting remain underexplored. In this work, we study multilingual routing dynamics during continual pre-training of an English-centric MoE model on a multilingual corpus, analyzing how expert usage varies across languag
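
As a rough illustration of what measuring expert usage per language can look like, the sketch below tallies which experts a top-k router selects for tokens tagged with a language label. It is not the paper's method: the function names (`top_k_experts`, `expert_usage_by_language`), the choice of k=2, and the toy logits are all assumptions made for this example.

```python
from collections import defaultdict


def top_k_experts(logits, k):
    """Return the indices of the k largest router logits.

    Ties are broken in favour of the lower expert index.
    """
    return sorted(range(len(logits)), key=lambda i: (-logits[i], i))[:k]


def expert_usage_by_language(token_logits, token_langs, k=2):
    """Compute, per language, the fraction of routing slots each expert gets.

    token_logits: one list of router logits per token (hypothetical input).
    token_langs:  one language tag per token, aligned with token_logits.
    """
    num_experts = len(token_logits[0])
    counts = defaultdict(lambda: [0] * num_experts)
    for logits, lang in zip(token_logits, token_langs):
        for expert in top_k_experts(logits, k):
            counts[lang][expert] += 1
    # Normalise the counts so each language's usage vector sums to 1.
    usage = {}
    for lang, per_expert in counts.items():
        total = sum(per_expert)
        usage[lang] = [n / total for n in per_expert]
    return usage


# Toy data (assumed): 4 tokens, 4 experts, 2 languages.
token_logits = [
    [2.0, 1.0, 0.1, -1.0],
    [1.5, 1.8, 0.0, -0.5],
    [-1.0, 0.2, 2.2, 1.9],
    [0.0, -0.3, 1.1, 2.4],
]
token_langs = ["en", "en", "de", "de"]

usage = expert_usage_by_language(token_logits, token_langs, k=2)
for lang, fractions in usage.items():
    print(lang, fractions)
```

In this toy data the two languages use disjoint experts. In a real model one would collect the router logits per MoE layer and compare the resulting usage vectors across languages and training checkpoints.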
