AdaSD: Adaptive Speculative Decoding for Efficient Language Model Inference

Event: PRODUCT_LAUNCH | 2026-05-27 | Impact: MEDIUM

arXiv:2512.11280v2 (Announce Type: replace)

Abstract: Large language models (LLMs) have achieved remarkable performance across a wide range of tasks, but their increasing parameter sizes significantly slow down inference. Speculative decoding mitigates this issue by leveraging a smaller draft model to predict candidate tokens, which are then verified by a larger target model. However, existing approaches often require add
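
The draft-then-verify loop mentioned in the abstract can be illustrated with a minimal sketch. This is generic greedy speculative decoding, not AdaSD's adaptive method, whose details are not given in the excerpt above. The names `speculative_step`, `generate`, `draft_next`, `target_next`, and the draft length `k` are hypothetical, and the two "models" are toy functions over integer tokens rather than real language models.

```python
from typing import Callable, List

Token = int
NextToken = Callable[[List[Token]], Token]  # greedy next-token function (assumed interface)


def speculative_step(prefix: List[Token], draft_next: NextToken,
                     target_next: NextToken, k: int) -> List[Token]:
    """Run one draft-then-verify round and return the tokens to append."""
    # 1. The small draft model proposes k candidate tokens autoregressively.
    proposed: List[Token] = []
    ctx = list(prefix)
    for _ in range(k):
        tok = draft_next(ctx)
        proposed.append(tok)
        ctx.append(tok)

    # 2. The target model checks each candidate. A real system would score all
    #    k positions in one batched forward pass; this sketch calls it per token.
    accepted: List[Token] = []
    ctx = list(prefix)
    for tok in proposed:
        expected = target_next(ctx)
        if expected != tok:
            # First mismatch: keep the target's token and discard the rest.
            accepted.append(expected)
            return accepted
        accepted.append(tok)
        ctx.append(tok)

    # 3. Every candidate matched, so the target contributes one extra token.
    accepted.append(target_next(ctx))
    return accepted


def generate(prompt: List[Token], draft_next: NextToken, target_next: NextToken,
             k: int, max_new_tokens: int) -> List[Token]:
    """Greedy generation that matches target-only decoding token for token."""
    out = list(prompt)
    while len(out) - len(prompt) < max_new_tokens:
        out.extend(speculative_step(out, draft_next, target_next, k))
    return out[:len(prompt) + max_new_tokens]


def greedy_baseline(prompt: List[Token], target_next: NextToken,
                    max_new_tokens: int) -> List[Token]:
    """Plain target-only greedy decoding, used as the reference output."""
    out = list(prompt)
    for _ in range(max_new_tokens):
        out.append(target_next(out))
    return out


# Toy stand-ins for the two models (illustrative only).
def target_next(ctx: List[Token]) -> Token:
    return (ctx[-1] + 1) % 10


def draft_next(ctx: List[Token]) -> Token:
    # Agrees with the target except after token 4, where it guesses wrong.
    return 0 if ctx[-1] == 4 else (ctx[-1] + 1) % 10
```

With these toy functions, `generate([0], draft_next, target_next, k=3, max_new_tokens=8)` produces the same sequence as `greedy_baseline([0], target_next, 8)`. Under greedy verification the draft model affects only how many target calls are needed, never which tokens are emitted.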