IAPO: Information-Aware Policy Optimization for Token-Efficient Reasoning

ArXiv CS.CL · 2026-06-01 · NEWS · en · Authors: Yinhan He, Yaochen Zhu, Mingjia Shi, Wendy Zheng, Lin Su, Xiaoqing Wang, Qi Guo, Jundong Li
