ForesightKV: Optimizing KV Cache Eviction for Reasoning Models by Learning Long-Term Contribution

Event: PRODUCT_LAUNCH, 2026-06-02, Impact: MEDIUM

arXiv:2602.03203v2 (Announce Type: replace)

Abstract: Recently, large language models (LLMs) have shown remarkable reasoning abilities by producing long reasoning traces. However, as the sequence length grows, the key-value (KV) cache expands linearly, incurring significant memory and computation costs. Existing KV cache eviction methods mitigate this issue by discarding less important KV pairs, but