Which Reasoning Trajectories Teach Students to Reason Better? A Simple Metric of Informative Alignment

Event type: PRODUCT_LAUNCH | Date: 2026-05-26 | Impact: MEDIUM

arXiv:2601.14249v5 (announce type: replace)

Abstract: Long chain-of-thought (CoT) trajectories provide rich supervision signals for distilling reasoning from teacher LLMs to student LLMs. However, both prior work and our experiments show that trajectories from stronger teachers do not necessarily yield better students, which highlights the importance of data-student suitability in distillation. Existing
