IndexMem: Learned KV-Cache Eviction with Latent Memory for Long-Context LLM Inference (Event)

Name: IndexMem: Learned KV-Cache Eviction with Latent Memory for Long-Context LLM Inference
Start: 2026-05-26

Type: PRODUCT_LAUNCH
Date: 2026-05-26
Impact: MEDIUM

arXiv:2605.25475v1
Announce Type: new

Abstract: Large Language Models (LLMs) are increasingly expected to operate over long contexts, yet standard softmax attention incurs a KV cache that grows linearly with sequence length, quickly becoming the bottleneck for long context inference. A practical remedy is to evict less important KV entries; however, existing eviction policies are largely heuristic and struggle

Category: Artificial Intelligence
