TIAR: Trajectory-Informed Advantage Reweighting for LLM Abstention Learning

arXiv cs.CL, 2026-05-26. Authors: Muyu Pan, Shu Zhao, Nan Zhang, Philip Shin, Varun Parekh, Vijaykrishnan Narayanan, Rui Zhang
