Deep Research as Rubric for Reinforcement Learning (Article)

ArXiv CS.CL · 2026-06-02 · NEWS · en · Authors: Wangyi Mei, Zhouhong Gu, Zhenhan Bai, Yin Cai, Lefan Zhang, Zhenxin Ding, Bo Chen, Yan Gao, Yi Wu, Yao Hu, Jiaqing Liang, Deqing Yang

Deep Research as Rubric for Reinforcement Learning · Related Techniques