A Local Perturbation Theory for Cross-Domain Interference and Recovery in Multi-Domain RL

Event

PRODUCT_LAUNCH 2026-06-02 Impact: MEDIUM

arXiv:2606.02398v1
Announce Type: cross

Abstract: Reinforcement learning (RL) post-training improves large language models (LLMs) on individual domains such as mathematical reasoning, code generation, question answering, and creative writing (CW), but training on one domain often degrades performance on others. Existing explanations based on catastrophic forgetting or global gradient conflict are incomplete ...