PRO-CUA: Process-Reward Optimization for Computer Use Agents

arXiv cs.AI · 2026-05-29 · Authors: Yifei He, Rui Yang, Hao Bai, Tong Zhang, Han Zhao

PRO-CUA: Process-Reward Optimization for Computer Use Agents · Related Technologies