Beyond the Target: From Imitation to Collaboration in Speculative Decoding 文章

ArXiv CS.CL2026-05-26NEWSen作者: Jinze Li, Yixing Xu, Guanchen Li, Jinfeng Xu, Shuo Yang, Yang Zhang, Xuanwu Yin, Dong Li, Edith C. H. Ngai, Emad Barsoum

查看原文 →

关系图谱

摘要

arXiv:2605.24793v1 Announce Type: new Abstract: Speculative decoding (SPD) accelerates large language model (LLM) inference by letting a smaller draft model propose multiple future tokens that are verified in parallel by a larger target model. The dominant SPD paradigm treats the target model as the sole reliable teacher, accepting a draft token only when it exactly matches the target prediction. This design implicitly assumes that the target is always the better choice at every position. In practice, this assumption does not hold. Although the draft is the weaker model overall, it is not uniformly inferior at the token level. In a meaningful fraction of cases where draft and target disagree, the draft's choice is the one that leads to the correct final answer. Inspired by this, we introduce \textbf{Collaborative Speculative Decoding (CoSpec)}, a generalization of SPD that no longer treats the target model as the sole token-level authority.

Beyond the Target: From Imitation to Collaboration in Speculative Decoding 文章

摘要

相关事件查看全部 (1)

相关公司

相关人物

相关产品

相关技术查看全部 (5)