Quantized Keys Steal Attention: Bias Correction for KV-Cache Compression in Video Diffusion
Event: PRODUCT_LAUNCH | Date: 2026-05-27 | Impact: MEDIUM
arXiv:2605.26266v1 (Announce Type: cross)
Abstract: Chunk-wise autoregressive video diffusion models rely on a KV cache of previously generated chunks to avoid redundant computation, but this cache quickly becomes a memory bottleneck as videos grow longer. Methods that quantize the KV cache to low bitwidths reduce memory pressure but degrade video quality. We show that a key driver of this degradation is a