Fast-dLLM++: Fréchet Profile Decoding for Faster Diffusion LLM Inference (Event)

PRODUCT_LAUNCH | 2026-06-03 | Impact: MEDIUM

arXiv:2606.02955v1 Announce Type: new

Abstract: Diffusion large language models promise parallel token generation, yet inference remains bottlenecked by deciding which masked tokens can be safely committed together. Fast-dLLM addressed this with KV caching and confidence-guided parallel decoding, but its decoding theory uses a homogeneous high-confidence assumption that effectively reduces each candidate set to its wea