ARES: Automated Rubric Synthesis for Scalable LLM Reinforcement Learning

ArXiv CS.CL · 2026-05-26 · Authors: Xiaoyuan Li, Keqin Bao, Moxin Li, Yubo Ma, Yichang Zhang, Wenjie Wang, Fuli Feng, Dayiheng Liu
