Alignment Risks from Capability-Seeking RL Training

ArXiv cs.CL, 2026-06-05. Authors: Yujun Zhou, Yue Huang, Han Bao, Kehan Guo, Zhenwen Liang, Pin-Yu Chen, Tian Gao, Werner Geyer, Nuno Moniz, Nitesh V Chawla, Xiangliang Zhang
