Efficient Post-training of LLMs for Code Generation With Offline Reinforcement Learning

arXiv cs.AI · 2026-05-28 · Authors: Mingze Wu, Abhinav Anand, Shweta Verma, Mira Mezini
