ECHO: Entropy-Confidence Hybrid Optimization for Test-Time Reinforcement Learning

arXiv cs.AI · 2026-05-28 · Authors: Chu Zhao, Enneng Yang, Yuting Liu, Jianzhe Zhao, Guibing Guo
