NestedKV: Nested Memory Routing for Long-Context KV Cache Compression
Event: PRODUCT_LAUNCH | 2026-05-27 | Impact: MEDIUM
arXiv:2605.26678v1. Announce Type: new.
Abstract: Long-context language models are limited by the memory footprint of the key-value (KV) cache. Existing training-free KV compression methods usually rank tokens by one importance signal (attention, recency, layer-wise allocation, or key distinctiveness), which becomes brittle when useful context is globally distinctive, locally episodic, or immediately relevant. We introduce N
Source: ArXiv CS.CL, 2026-05-27