Rethinking the Multilingual Reasoning Gap with Layer Swap

arXiv cs.CL | 2026-05-27 | Authors: Maxence Lasbordes, Amélie Chatelain, Djamé Seddah

Abstract

arXiv:2605.26735v1 (announce type: new)

Recent reasoning Large Language Models produce their chain-of-thought (CoT) predominantly in English, even when prompted in other languages. Prior work suggests that forcing the CoT to remain in the input language ("native reasoning") substantially degrades performance relative to letting the model reason in English before answering in the input language ("English-pivoted reasoning"). However, most studies of this native reasoning gap rely on inference-time interventions or on limited native-language training data. We revisit this comparison at a larger scale and under comparable supervision. We construct long multilingual reasoning datasets across six languages (English, French, German, Spanish, Chinese, and Swahili), fine-tune specialists in both the native and English-pivoted regimes on top of Qwen/Qwen3-8B-Base, and evaluate across mathematics, science, general knowledge, and code.
