IndexMem: Learned KV-Cache Eviction with Latent Memory for Long-Context LLM Inference

ArXiv CS.CL · 2026-05-26 · Authors: Xintong Yang, Hao Gu, Binxing Xu, Lujun Li, Bei Liu, Jiacheng Liu, Qiyuan Zhu, Sirui Han, Yike Guo
