Better Accuracies, Worse Reasoning: A Step-Level Audit of Medical Chain-of-Thought Distillation

PRODUCT_LAUNCH | 2026-05-28 | Impact: MEDIUM

arXiv:2605.28301v1 Announce Type: new

Abstract: Chain-of-thought (CoT) distillation trains a smaller model to imitate a teacher's reasoning trace, but it is typically evaluated by final-answer metrics including accuracy. We ask whether gains in answer quality are accompanied by improvements in the trace. In medical QA, where short answer options can leave a richer clinical justification under-specifi