SCOPE: Self-Play via Co-Evolving Policies for Open-Ended Tasks

ArXiv CS.CL · 2026-06-01 · Authors: Wai-Chung Kwan, Aryo Pradipta Gema, Joshua Ong Jun Leang, Pasquale Minervini

Related Technologies