Failed Reasoning Traces Tell You What Is Fixable (But Not by Reading Them) 事件

PRODUCT_LAUNCH2026-06-04影响: MEDIUM

Failed Reasoning Traces Tell You What Is Fixable (But Not by Reading Them) arXiv:2606.05145v1 Announce Type: cross Abstract: When post-trained language models fail on reasoning problems, the common test-time-scaling response is to spend more compute on additional attempts, and the failed traces play no further role. We argue this discards a crucial signal; some failures come from unlucky sampling, where more rollouts help, while others are structural and resist resampling regardless of budget.