LK Losses: Direct Acceptance Rate Optimization for Speculative Decoding

Event type: PRODUCT_LAUNCH | Date: 2026-06-02 | Impact: MEDIUM

arXiv:2602.23881v2
Announce Type: replace-cross

Abstract: Speculative decoding accelerates autoregressive large language model (LLM) inference by using a lightweight draft model to propose candidate tokens that are then verified in parallel by the target model. The speedup is significantly determined by the acceptance rate, yet standard training minimizes Kullback-Leibler (KL) divergence as a proxy objective. While KL diver
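To make the gap between the proxy objective and the acceptance rate concrete, here is a minimal sketch on toy distributions. It is not the paper's LK loss. It assumes the standard speculative sampling rule, under which the expected per-token acceptance probability is the sum over tokens of min(p(x), q(x)) for target distribution p and draft distribution q, and it assumes the KL term is the forward divergence KL(p || q). The function names and the two-token distributions are hypothetical, chosen only for illustration.

```python
import math


def acceptance_rate(p, q):
    """Expected per-token acceptance probability, assuming the standard
    speculative sampling rule: sum over tokens of min(p(x), q(x))."""
    return sum(min(pi, qi) for pi, qi in zip(p, q))


def kl_divergence(p, q):
    """Forward KL(p || q) in nats (assumed direction, not taken from the paper).
    Terms with p(x) == 0 contribute nothing and are skipped."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical two-token vocabulary: one target and two candidate drafts.
target = [0.90, 0.10]
draft_a = [0.80, 0.20]  # spreads mass, close to the target in KL
draft_b = [0.99, 0.01]  # over-confident, farther from the target in KL

for name, draft in (("draft_a", draft_a), ("draft_b", draft_b)):
    print(
        f"{name}: acceptance={acceptance_rate(target, draft):.3f} "
        f"KL={kl_divergence(target, draft):.3f}"
    )
```

In this toy case draft_b has the higher acceptance rate (about 0.91 versus 0.90) even though its KL divergence from the target is larger (about 0.144 versus 0.037), so ranking drafts by KL and ranking them by acceptance rate can disagree.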