Efficient Diffusion LLMs via Temporal-Spatial Parallel Decoding and Confidence Extrapolation 文章

ArXiv CS.CL2026-06-01NEWSen作者: Zekai Li, Ji Liu, Yiqing Huang, Ziqiong Liu, Dong Li, Emad Barsoum

摘要

arXiv:2605.30753v1 Announce Type: new Abstract: Diffusion-based large language models (dLLMs) support parallel text generation via iterative denoising, yet inference remains latency-heavy because many steps are spent on redundant refinement and repeated remasking of tokens whose final values are already determined. Prior acceleration methods mainly depend on step-local confidence heuristics or fixed schedules, which are sensitive to prompt and task variation and ignore strong positional effects within a sequence. We cast diffusion decoding as a dynamic control problem and show that token-wise denoising trajectories provide the key signal for reliable control. We propose a trace-aware decoding framework with two components.

Efficient Diffusion LLMs via Temporal-Spatial Parallel Decoding and Confidence Extrapolation 文章

摘要

相关事件查看全部 (1)

相关公司

相关人物

相关产品

相关技术查看全部 (2)