A Local Perturbation Theory for Cross-Domain Interference and Recovery in Multi-Domain RL

arXiv cs.CL | 2026-06-02 | Authors: Lei Yang, Siyu Ding, Deyi Xiong

Abstract

arXiv:2606.02398v1 (cross-listed). Reinforcement learning (RL) post-training improves large language models (LLMs) on individual domains such as mathematical reasoning, code generation, question answering, and creative writing (CW), but training on one domain often degrades performance on others. Existing explanations based on catastrophic forgetting or global gradient conflict are incomplete: substantial interference can occur even when full-model gradients are nearly orthogonal. We show that single-domain RL produces sparse, small-magnitude parameter edits with weak overlap among top-changed neurons. Different domains nevertheless share substantial active computation routes, and the update directions along those shared routes determine whether the domains act synergistically or conflict.
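The two quantities the abstract contrasts, global alignment between updates and overlap among the most-changed neurons, can be illustrated with a small sketch. This is not the authors' method: the function names, the choice of per-row L2 norm as a neuron's "change", the Jaccard index as the overlap measure, and the toy matrices are all assumptions made for illustration. Updates are assumed to be non-zero.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two flattened update vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)  # assumes neither update is all zeros

def top_changed(delta, k):
    """Indices of the k rows (treated here as neurons) with the largest L2 change."""
    norms = [math.sqrt(sum(x * x for x in row)) for row in delta]
    order = sorted(range(len(norms)), key=lambda i: norms[i], reverse=True)
    return set(order[:k])

def compare_updates(delta_a, delta_b, k):
    """Return (global cosine similarity, Jaccard overlap of top-k changed rows)."""
    flat_a = [x for row in delta_a for x in row]
    flat_b = [x for row in delta_b for x in row]
    top_a, top_b = top_changed(delta_a, k), top_changed(delta_b, k)
    jaccard = len(top_a & top_b) / len(top_a | top_b)
    return cosine_similarity(flat_a, flat_b), jaccard

# Hypothetical toy updates: each domain edits a single, different neuron.
delta_math = [[0.1, 0.0], [0.0, 0.0], [0.0, 0.0]]
delta_code = [[0.0, 0.0], [0.0, 0.2], [0.0, 0.0]]
cos, overlap = compare_updates(delta_math, delta_code, k=1)
print(cos, overlap)
```

In this toy case the two updates are orthogonal and touch disjoint neurons, so neither statistic signals a conflict. That is the situation the abstract argues is insufficient to rule out interference, because the statistics say nothing about computation routes both domains activate.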
