Extracting Small Translation Specialists from LLMs by Aggressively Pruning Experts

arXiv cs.CL | 2026-05-28 | Authors: Liu O. Martin, Lucas Bandarkar, Nanyun Peng

Abstract

arXiv:2605.28042v1

Modern large language models (LLMs) achieve state-of-the-art machine translation performance, but they do so as broad generalists trained for many tasks and capabilities largely unrelated to translation. They are therefore heavily overparameterized for this task, which leads to excessive memory and compute requirements. In this paper, we present a method for aggressively pruning experts from modern mixture-of-experts (MoE) LLMs while incurring negligible degradation in translation quality. Our approach exploits expert specialization and the separability of multilingual capabilities in LLMs to identify experts irrelevant to translation. Because MoEs are modular, these experts can be removed without any training: we are able to prune half of all experts with negligible degradation and 70% with only minor losses.
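To make the idea of training-free expert pruning concrete, here is a minimal sketch in plain Python. The abstract does not say how irrelevant experts are identified, so the criterion below (keep the experts the router selects most often on translation inputs) is an assumption for illustration, not the authors' method. The names `select_experts_to_keep`, `prune_layer`, and the calibration log format are hypothetical.

```python
from collections import Counter


def select_experts_to_keep(routed_expert_ids, num_experts, keep_fraction):
    """Return the sorted ids of the most frequently routed experts in one layer.

    routed_expert_ids: expert ids chosen by the router on translation inputs
    (a hypothetical calibration log, one id per routing decision).
    ASSUMPTION: routing frequency stands in for "relevance to translation".
    """
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    counts = Counter(routed_expert_ids)
    num_keep = max(1, round(num_experts * keep_fraction))
    # Rank by usage count (descending); break ties by lower expert id.
    ranked = sorted(range(num_experts), key=lambda e: (-counts[e], e))
    return sorted(ranked[:num_keep])


def prune_layer(expert_weights, router_rows, keep_ids):
    """Drop pruned experts and their matching router rows.

    Nothing is retrained: surviving experts and router rows are copied as-is,
    so the router can only choose among the kept experts afterwards.
    """
    kept_experts = [expert_weights[e] for e in keep_ids]
    kept_router = [router_rows[e] for e in keep_ids]
    return kept_experts, kept_router


# Toy layer with 8 experts; strings stand in for weight tensors.
routing_log = [0, 0, 0, 1, 3, 3, 5, 5, 5, 5, 7]
keep_half = select_experts_to_keep(routing_log, num_experts=8, keep_fraction=0.5)
experts, router = prune_layer(
    [f"w{i}" for i in range(8)], [f"r{i}" for i in range(8)], keep_half
)
```

Removing an expert together with its router row is what makes this kind of pruning possible without training: the remaining weights are untouched, and the layer simply has fewer experts to route to. A real MoE checkpoint would need the same selection applied per layer to its actual expert and gate tensors.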
