Same Evidence, Different Answers: Canonical-Context On-Policy Distillation for Multi-Turn Language Models

ArXiv CS.CL · 2026-05-29 · Authors: Zizhuo Lin, Quanling Liu, Jinsheng Quan, Chao Zhang, Yifan Zhu, Xing Shi, Jingtao Xu, Zhihui Li, Yawei Luo
