AdaptR1: Reinforcement Learning Based Adaptive Interleaved Thinking in Multi-hop Question Answering 文章

ArXiv CS.CL2026-06-01NEWSen作者: Yuxin Wang, Jiahao Lu, Qifeng Wu, Shicheng Fang, Chuanyuan Tan, Yining Zheng, Xuanjing Huang, Xipeng Qiu

AdaptR1: Reinforcement Learning Based Adaptive Interleaved Thinking in Multi-hop Question Answering · 相关技术