DFlare: Scaling Up Draft Capacity for Block Diffusion Speculative Decoding (Event)

Name: DFlare: Scaling Up Draft Capacity for Block Diffusion Speculative Decoding
Start: 2026-06-02

BREAKTHROUGH | 2026-06-02 | Impact: HIGH

arXiv:2606.02091v1 | Announce Type: new

Abstract: Block diffusion speculative decoding accelerates LLM inference by predicting all tokens within a block simultaneously for the target model to verify in parallel. Predicting an entire block at once requires a sufficiently capable draft model and effective utilization of the target model's internal knowledge. However, the state-of-the-art method DFlash constrains all draft la
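To make the draft-then-verify loop described in the abstract concrete, here is a minimal sketch of generic greedy block verification. It is not the DFlare or DFlash method, whose details are not given in this excerpt. The names `verify_block` and `toy_target` are hypothetical, and the toy target model (next token = last token + 1, modulo 10) is an assumption made purely for illustration.

```python
from typing import Callable, List, Sequence


def verify_block(
    prefix: Sequence[int],
    draft_block: Sequence[int],
    target_next_token: Callable[[List[int]], int],
) -> List[int]:
    """Greedy verification of one drafted block (illustrative only).

    Accepts the longest run of drafted tokens that the target model would
    also have produced. At the first mismatch the target's own token is
    emitted instead; if every drafted token is accepted, one extra target
    token is appended. A real system scores all block positions in a single
    parallel forward pass; this loop emulates that check position by position.
    """
    context = list(prefix)
    accepted: List[int] = []
    for token in draft_block:
        expected = target_next_token(context)
        if token != expected:
            accepted.append(expected)  # replace the rejected draft token
            return accepted
        accepted.append(token)
        context.append(token)
    accepted.append(target_next_token(context))  # bonus token after full acceptance
    return accepted


def toy_target(context: List[int]) -> int:
    """Hypothetical stand-in for a target model: counts upward modulo 10."""
    return (context[-1] + 1) % 10


# The third drafted token (9) disagrees with the target (5), so two draft
# tokens are accepted and the target's correction is appended.
print(verify_block([1, 2], [3, 4, 9, 6], toy_target))  # [3, 4, 5]

# A fully correct draft yields the whole block plus one bonus token.
print(verify_block([1], [2, 3], toy_target))  # [2, 3, 4]
```

In this sketch each verification step produces at least one token, so output never falls behind plain autoregressive decoding; the speedup depends on how many drafted tokens per block the target accepts, which is why draft capacity matters.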

Artificial Intelligence
