AdaPLD: Adaptive Retrieval and Reuse for Efficient Model-Free Speculative Decoding (Event)

PRODUCT_LAUNCH | 2026-06-05 | Impact: MEDIUM

arXiv:2606.05742v1
Announce Type: new

Abstract: Speculative decoding accelerates generation by verifying multiple drafted tokens in a single target-model forward pass, reducing sequential decoding iterations. Model-free variants avoid auxiliary draft models by reusing text and model states already available during generation, but their speedup depends on the reliability of the constructed drafts. We identify two
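
To make the mechanism in the abstract concrete, here is a minimal sketch of generic model-free drafting with greedy verification. It is not AdaPLD's adaptive method, which the truncated abstract does not describe. The function names (`draft_from_context`, `verify_greedy`, `generate`), the n-gram matching rule, and the toy target `toy_next_token` are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical stand-in for a target model: maps a context to its greedy next token.
NextToken = Callable[[Sequence[int]], int]


def draft_from_context(tokens: Sequence[int], ngram: int = 2, max_draft: int = 4) -> List[int]:
    """Model-free draft: find the most recent earlier occurrence of the last
    `ngram` tokens and copy the tokens that followed it."""
    if len(tokens) < ngram:
        return []
    suffix = list(tokens[-ngram:])
    # Scan right to left over start positions that precede the suffix itself.
    for start in range(len(tokens) - ngram - 1, -1, -1):
        if list(tokens[start:start + ngram]) == suffix:
            return list(tokens[start + ngram:start + ngram + max_draft])
    return []


def verify_greedy(tokens: Sequence[int], draft: Sequence[int], next_token: NextToken) -> List[int]:
    """Accept the longest draft prefix that matches the target's greedy choices,
    then append one target token (a correction or a bonus token).
    A real system scores all draft positions in one batched forward pass;
    this sketch simulates that with sequential calls."""
    accepted: List[int] = []
    context = list(tokens)
    for d in draft:
        t = next_token(context)
        if t != d:
            accepted.append(t)  # first mismatch: keep the target's token and stop
            return accepted
        accepted.append(d)
        context.append(d)
    accepted.append(next_token(context))  # every draft token matched: bonus token
    return accepted


def generate(prompt: Sequence[int], next_token: NextToken, max_new_tokens: int,
             ngram: int = 2, max_draft: int = 4) -> Tuple[List[int], int]:
    """Return the generated sequence and the number of verification steps used."""
    tokens = list(prompt)
    steps = 0
    while len(tokens) - len(prompt) < max_new_tokens:
        draft = draft_from_context(tokens, ngram, max_draft)
        tokens.extend(verify_greedy(tokens, draft, next_token))
        steps += 1
    return tokens[:len(prompt) + max_new_tokens], steps


def toy_next_token(context: Sequence[int]) -> int:
    """Toy deterministic target: counts upward modulo 5."""
    return (context[-1] + 1) % 5


if __name__ == "__main__":
    out, steps = generate([0, 1, 2, 3, 4, 0, 1], toy_next_token, max_new_tokens=10)
    print(out, steps)
```

Because verification only accepts tokens the target itself would have chosen, the output matches plain greedy decoding; the saving comes from needing fewer verification steps than generated tokens when the drafts are reliable, which is the dependency the abstract points to.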