Efficient Diffusion LLMs via Temporal-Spatial Parallel Decoding and Confidence Extrapolation (Event)

PRODUCT_LAUNCH | 2026-06-01 | Impact: MEDIUM

arXiv:2605.30753v1 (Announce Type: new)

Abstract: Diffusion-based large language models (dLLMs) support parallel text generation via iterative denoising, yet inference remains latency-heavy because many steps are spent on redundant refinement and repeated remasking of tokens whose final values are already determined. Prior acceleration methods mainly depend on step-local confidence heuristics or fixed sch