Speculative Thinking: Enhancing Small-Model Reasoning with Large Model Guidance at Inference Time

ArXiv CS.CL | 2026-06-04 | Authors: Wang Yang, Xiang Yue, Vipin Chaudhary, Xiaotian Han

Abstract

arXiv:2504.12329v2

Recent work relies on post-training to improve model reasoning, which typically requires costly training pipelines and still yields inefficient, overly long outputs. We introduce Speculative Thinking, a training-free framework in which a large reasoning model guides a smaller one during inference. The guidance operates at the reasoning level, unlike speculative decoding, which operates at the token level. Our approach is based on two observations: (1) reasoning-supportive tokens such as "wait" frequently appear after structural delimiters like "\n\n", serving as signals for reflection or continuation; and (2) larger models exhibit stronger control over reflective behavior, reducing unnecessary backtracking while improving reasoning quality. By strategically delegating reflective steps to a more capable model, our method significantly boosts the reasoning accuracy of reasoning models while shortening their output.
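The delegation idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the control loop, so everything here is an assumption made for illustration, including the function names `starts_with_cue` and `speculative_thinking`, the choice to treat each "\n\n"-separated segment as one reasoning step, the rule that a segment opening with "wait" is handed to the large model, and the stub models `small` and `large`, which stand in for real inference calls.

```python
from typing import Callable, List, Tuple

DELIMITER = "\n\n"            # structural delimiter named in the abstract
REFLECTIVE_CUES = ("wait",)   # only "wait" is named in the abstract

def starts_with_cue(segment: str) -> bool:
    """Return True if the segment opens with a reflective cue (hypothetical rule)."""
    return segment.lstrip().lower().startswith(REFLECTIVE_CUES)

def speculative_thinking(
    small_step: Callable[[str], str],
    large_step: Callable[[str], str],
    prompt: str,
    max_segments: int = 8,
) -> Tuple[str, List[Tuple[str, str]]]:
    """Generate segment by segment with the small model; when its next segment
    begins with a reflective cue, discard it and let the large model write
    that step instead. Each step function returns "" when it has finished."""
    context = prompt
    trace: List[Tuple[str, str]] = []
    for _ in range(max_segments):
        segment, source = small_step(context), "small"
        if segment and starts_with_cue(segment):
            # Assumed policy: delegate the reflective step to the large model.
            segment, source = large_step(context), "large"
        if not segment:
            break
        context += DELIMITER + segment
        trace.append((source, segment))
    return context, trace

# Stub models (assumptions) that replay fixed text instead of running inference.
SMALL_SCRIPT = ["Let x = 2.", "Wait, maybe x = 3?", "So the answer is 2."]

def small(context: str) -> str:
    done = context.count(DELIMITER)  # segments appended so far
    return SMALL_SCRIPT[done] if done < len(SMALL_SCRIPT) else ""

def large(context: str) -> str:
    return "Check: 2 + 2 = 4, so x = 2 holds."

text, trace = speculative_thinking(small, large, "Solve x + 2 = 4.")
for source, segment in trace:
    print(f"[{source}] {segment}")
```

In this toy run the small model's "Wait, ..." segment is replaced by the large model's check, while the other segments stay with the small model. A real system would need a policy for when control returns to the small model and how much text the large model contributes; the abstract does not give those details.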
