GRKV: Global Regression for Training-Free KV Cache Compression in Long-Context LLMs
Event type: PRODUCT_LAUNCH | Date: 2026-06-01 | Impact: MEDIUM
arXiv:2605.31105v1 | Announce Type: new
Abstract: Large language models (LLMs) with extended context lengths rely on the key-value (KV) cache to support attention over prior tokens. However, maintaining the KV cache incurs substantial memory overhead, motivating KV-cache compression methods that enforce a fixed budget through eviction and merging. Modern eviction methods increasingly adopt span-based retention bec
Source: ArXiv CS.CL, 2026-06-01