xKV: Cross-Layer KV-Cache Compression via Aligned Singular Vector Extraction (Event)

PRODUCT_LAUNCH | 2026-05-28 | Impact: MEDIUM

arXiv:2503.18893v2 (announce type: replace)

Abstract: Long-context Large Language Models (LLMs) enable powerful applications but incur high memory costs due to the key-value states (KV-Cache). Recent studies attempt to share KV-Cache across layers, but these approaches either require expensive pretraining or rely on per-token cross-layer cosine similarity that is often limited in practice. We show, via Centered Kernel Al