OccamToken: Efficient VLM Inference with Training-Free and Budget-Adaptive Token Pruning Event

Name: OccamToken: Efficient VLM Inference with Training-Free and Budget-Adaptive Token Pruning
Start: 2026-05-29

Type: PRODUCT_LAUNCH
Date: 2026-05-29
Impact: MEDIUM

arXiv:2605.29657v1
Announce Type: new
Abstract: Vision-language models (VLMs) rely on long visual token sequences for visual understanding, making the prefill stage expensive in both computation and memory. Most existing pruning methods follow an absolute-ranking paradigm, assigning importance scores to visual tokens and retaining a fixed top-K subset. In this work, we argue that this paradigm is fundamenta

Artificial Intelligence
