WhiFlash: Accelerating Speculative Decoding with Token-Level Cross-Paradigm Routing

ArXiv CS.AI · 2026-06-09 · Authors: Young D. Kwon, Miles Williams, Rui Li, Alexandros Kouris, Stylianos I. Venieris
