Reproducing, Analyzing, and Detecting Reward Hacking in Rubric-Based Reinforcement Learning

arXiv cs.CL · 2026-06-04 · Authors: Xuekang Wang, Zhuoyuan Hao, Shuo Hou, Hao Peng, Juanzi Li, Xiaozhi Wang
