DyLLM: Efficient Diffusion LLM Inference via Saliency-based Token Selection and Partial Attention

arXiv cs.CL, 2026-06-02. Authors: Younjoo Lee, Seungkyun Dan, Junghoo Lee, Jaiyoung Park, Jung Ho Ahn
