Draft-OPD: On-Policy Distillation for Speculative Draft Models · Event

PRODUCT_LAUNCH · 2026-05-29 · Impact: MEDIUM

arXiv:2605.29343v1
Announce Type: new

Abstract: Speculative decoding accelerates large language model inference by pairing a target model with a lightweight draft model whose proposed tokens are verified in parallel. A common way to build draft models, such as EAGLE3 or DFlash, is supervised fine-tuning (SFT) on target-generated trajectories. However, we observe that SFT quickly plateaus: the draft model's acceptance length on test data
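To make the "acceptance length" metric in the abstract concrete, here is a minimal Python sketch. It assumes greedy verification, where a draft token is accepted only if it equals the target model's token at the same position and everything after the first mismatch is discarded. This is a simplification, not the paper's method: sampling-based verification accepts tokens probabilistically, and some papers also count the extra token the target model contributes at each step. The helper names `accepted_prefix_length` and `mean_acceptance_length` are hypothetical and chosen only for illustration.

```python
from typing import Iterable, Sequence, Tuple


def accepted_prefix_length(draft: Sequence[int], target: Sequence[int]) -> int:
    """Count leading draft tokens that match the target's tokens.

    Assumption: greedy verification, so acceptance stops at the first mismatch.
    """
    accepted = 0
    for draft_token, target_token in zip(draft, target):
        if draft_token != target_token:
            break
        accepted += 1
    return accepted


def mean_acceptance_length(
    steps: Iterable[Tuple[Sequence[int], Sequence[int]]]
) -> float:
    """Average accepted-prefix length over (draft, target) verification steps."""
    lengths = [accepted_prefix_length(draft, target) for draft, target in steps]
    return sum(lengths) / len(lengths) if lengths else 0.0


if __name__ == "__main__":
    # Toy token ids, invented for illustration only.
    steps = [
        ([5, 7, 9, 2], [5, 7, 1, 2]),  # mismatch at the third token
        ([1, 2], [1, 2]),              # every draft token accepted
        ([3], [4]),                    # first token rejected
    ]
    print(mean_acceptance_length(steps))
```

A higher mean acceptance length means fewer target-model forward passes per generated token, which is why the metric serves as a proxy for speedup.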
