Verifying Meta-Awareness via Predictive Rewards in Reasoning Models

ArXiv CS.AI · 2026-06-02 · Authors: Yoonjeon Kim, Doohyuk Jang, Eunho Yang

Related Technologies