CriticalKV: Optimizing KV Cache Eviction from an Output Perturbation Perspective (Event)

Name: CriticalKV: Optimizing KV Cache Eviction from an Output Perturbation Perspective
Start: 2026-05-29

Type: PRODUCT_LAUNCH
Date: 2026-05-29
Impact: MEDIUM

arXiv:2502.03805v2
Announce Type: replace
Abstract: Large language models have revolutionized natural language processing but face significant challenges of high storage and runtime costs, due to the transformer architecture's reliance on self-attention, particularly the large KV cache for long-sequence inference. Recent efforts to reduce KV cache size by pruning less critical entries based on attention weights rem

Category: Artificial Intelligence
