HACK++: Towards More Effective Head-Aware Key-Value Compression for Efficient Visual Autoregressive Modeling

Event: PRODUCT_LAUNCH | Date: 2026-06-09 | Impact: MEDIUM

arXiv:2606.08302v1 | Announce Type: new

Abstract: Visual Autoregressive (VAR) models adopt a next-scale prediction paradigm, offering high-quality generation with substantially fewer decoding steps. However, existing VAR models suffer from significant attention complexity and severe memory overhead due to the accumulation of key-value (KV) caches across scales. In this paper, we tackle thi
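
The memory overhead the abstract attributes to KV caches accumulating across scales can be made concrete with a little arithmetic. The sketch below is illustrative only and is not taken from the paper. The scale schedule `SCALE_SIDES`, the helpers `kv_cache_growth` and `kv_cache_bytes`, and the model dimensions (16 layers, 16 heads, head dimension 64, 2 bytes per value) are all hypothetical assumptions. It assumes each scale attends to the cached keys and values of every earlier scale plus its own tokens.

```python
# Hypothetical scale schedule: side length of the square token map at each scale.
# Assumption for illustration only; not taken from the HACK++ abstract.
SCALE_SIDES = (1, 2, 3, 4, 5, 6, 8, 10, 13, 16)


def kv_cache_growth(sides):
    """Return one (new_tokens, cached_tokens, score_entries) tuple per scale."""
    rows = []
    cached = 0
    for side in sides:
        new_tokens = side * side
        # The cache holds keys/values for all scales so far, including this one.
        cached += new_tokens
        # Each new query token scores against every cached key (per head, per layer).
        rows.append((new_tokens, cached, new_tokens * cached))
    return rows


def kv_cache_bytes(total_tokens, layers, heads, head_dim, bytes_per_value=2):
    """Uncompressed KV cache size for one sample; the leading 2 counts keys and values."""
    return 2 * layers * heads * head_dim * total_tokens * bytes_per_value


rows = kv_cache_growth(SCALE_SIDES)
total_tokens = rows[-1][1]

for side, (new_tokens, cached, scores) in zip(SCALE_SIDES, rows):
    print(f"side={side:2d} new={new_tokens:3d} cached={cached:3d} scores={scores}")

# Hypothetical model dimensions, chosen only to show the order of magnitude.
size = kv_cache_bytes(total_tokens, layers=16, heads=16, head_dim=64)
print(f"total cached tokens: {total_tokens}, KV cache: {size / 2**20:.1f} MiB per sample")
```

Under these assumptions the final scale contributes 256 new tokens but attends over all 680 cached tokens, so the last few scales account for most of both the cache and the attention cost. That is the kind of overhead a per-head KV compression scheme would aim to reduce.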