SPEED-Bench: A Unified and Diverse Benchmark for Speculative Decoding (Event)

PRODUCT_LAUNCH | 2026-05-29 | Impact: MEDIUM

arXiv:2604.09557v2 Announce Type: replace-cross

Abstract: Speculative Decoding (SD) has emerged as a critical technique for accelerating Large Language Model (LLM) inference. Unlike deterministic system optimizations, SD performance is inherently data-dependent, meaning that diverse and representative workloads are essential for accurately measuring its effectiveness. Existing benchmarks suffer from limited task diversity, in