SFT Overtraining Predicts Rank Inversion via Entropy Collapse Under RLVR

ArXiv CS.CL · 2026-06-18 · NEWS · en · Authors: Siddharth Aphale, Kelly Liu

Related Technologies