KnapSpec: Self-Speculative Decoding via Adaptive Layer Selection as a Knapsack Problem (Event)

PRODUCT_LAUNCH | 2026-06-03 | Impact: MEDIUM

arXiv:2602.20217v2. Announce Type: replace-cross.

Abstract: Self-speculative decoding (SSD) accelerates LLM inference by skipping layers to create an efficient draft model, yet existing methods often rely on static heuristics that ignore the dynamic computational overhead of attention in long-context scenarios. We propose KnapSpec, a training-free framework that reformulates draft model selection as a knapsack
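
For readers unfamiliar with the knapsack framing, a generic 0/1 knapsack over layers might be sketched as follows. This is an illustration of the textbook technique only, not KnapSpec's actual algorithm: the function name `select_layers`, the integer per-layer costs, the per-layer values, and the budget are all hypothetical.

```python
from typing import List, Tuple


def select_layers(
    costs: List[int], values: List[float], budget: int
) -> Tuple[float, List[int]]:
    """Textbook 0/1 knapsack: choose which layers to keep in a draft model.

    costs[i]  -- hypothetical integer compute cost of keeping layer i
    values[i] -- hypothetical usefulness score of keeping layer i
    budget    -- hypothetical total compute allowed for the draft model
    Returns (best total value, sorted indices of the kept layers).
    """
    n = len(costs)
    # best[i][b] = max total value using only the first i layers with budget b
    best = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost, value = costs[i - 1], values[i - 1]
        for b in range(budget + 1):
            best[i][b] = best[i - 1][b]  # option 1: skip layer i-1
            if cost <= b:  # option 2: keep layer i-1 if it fits
                candidate = best[i - 1][b - cost] + value
                if candidate > best[i][b]:
                    best[i][b] = candidate

    # Walk the table backwards to recover which layers were kept.
    kept: List[int] = []
    b = budget
    for i in range(n, 0, -1):
        if best[i][b] != best[i - 1][b]:
            kept.append(i - 1)
            b -= costs[i - 1]
    kept.reverse()
    return best[n][budget], kept


if __name__ == "__main__":
    # Made-up numbers purely for demonstration.
    total, kept = select_layers([2, 3, 4, 5], [3.0, 4.0, 5.0, 6.0], budget=5)
    print(total, kept)
```

In a long-context setting the cost of an attention layer would grow with sequence length, so under this framing the costs (and hence the selected layers) would need to be recomputed as the context grows rather than fixed once.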