Deep Research as Rubric for Reinforcement Learning (Article)

ArXiv CS.CL · 2026-06-02 · NEWS · en · Authors: Wangyi Mei, Zhouhong Gu, Zhenhan Bai, Yin Cai, Lefan Zhang, Zhenxin Ding, Bo Chen, Yan Gao, Yi Wu, Yao Hu, Jiaqing Liang, Deqing Yang
