dLLM-Cache: Accelerating Diffusion Large Language Models with Adaptive Caching (Event)

PRODUCT_LAUNCH | 2026-06-03 | Impact: MEDIUM

arXiv:2506.06295v2 (announce type: replace-cross)

Abstract: Autoregressive Models (ARMs) have long dominated the landscape of Large Language Models. Recently, a new paradigm has emerged in the form of diffusion-based Large Language Models (dLLMs), which generate text by iteratively denoising masked segments. This approach has shown significant advantages and potential. However, dLLMs suffer from high inference latency.
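
To make "iteratively denoising masked segments" concrete, here is a toy Python sketch. It is not the paper's method or any real dLLM: `toy_predict` is a hypothetical stand-in for a model forward pass, and the rule of committing the highest-confidence predictions at each step is an assumption chosen for illustration. The loop only shows the general shape of the process, in which the sequence starts fully masked and every denoising step requires another pass over it.

```python
import random

MASK = "<mask>"

def toy_predict(tokens, position, vocab, rng):
    """Hypothetical stand-in for a model: returns (token, confidence)."""
    return rng.choice(vocab), rng.random()

def iterative_denoise(length, steps, vocab, seed=0):
    """Start fully masked and unmask a few positions per step (toy example)."""
    rng = random.Random(seed)
    tokens = [MASK] * length
    per_step = -(-length // steps)  # ceiling division: positions committed per step
    forward_passes = 0
    while MASK in tokens:
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        # One simulated "forward pass" scores every still-masked position.
        preds = {i: toy_predict(tokens, i, vocab, rng) for i in masked}
        forward_passes += 1
        # Commit the most confident predictions; the rest stay masked.
        best = sorted(masked, key=lambda i: preds[i][1], reverse=True)[:per_step]
        for i in best:
            tokens[i] = preds[i][0]
    return tokens, forward_passes

if __name__ == "__main__":
    out, passes = iterative_denoise(length=8, steps=4, vocab=["a", "b", "c"])
    print(out, passes)
```

In this sketch the number of simulated passes grows with the number of denoising steps, which gives a rough picture of why repeated denoising can be costly at inference time.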