Learning to Adapt SFT Data for Better Reasoning Generalization · Event

PRODUCT_LAUNCH · 2026-05-27 · Impact: MEDIUM

arXiv:2605.26924v1 · Announce Type: new

Abstract: Large language models (LLMs) have achieved remarkable progress, with post-training playing a crucial role in enhancing their reasoning capabilities. Among post-training paradigms, supervised fine-tuning (SFT) is widely used: it leverages external data to provide dense supervision and enables efficient training. However, directly fine-tuning on expert data can hurt generalization when t

Learning to Adapt SFT Data for Better Reasoning Generalization · Related Companies

Vise · COMPANY
arXiv · NONPROFIT
IREC · NONPROFIT
EARN · NONPROFIT
ACT · NONPROFIT
Ratio · RESEARCH_INSTITUTE