When RL Suppresses Its Own Vocabulary: Recovering Reasoning Diversity in Puzzle-to-Math Transfer

ArXiv CS.CL · 2026-05-29 · Authors: Mayug Maniparambil, Arjun Karuvally, Terrence Sejnowski, Fergal Reid
