EarlyTom: Early Token Compression Completes Fast Video Understanding · Event

PRODUCT_LAUNCH · 2026-05-29 · Impact: MEDIUM

arXiv:2605.30010v1 · Announce Type: new

Abstract: Video large language models (Video-LLMs) have demonstrated strong capabilities in video understanding tasks. However, their practical deployment is still hindered by the inefficiency introduced by processing massive amounts of visual tokens. Although recent approaches achieve extremely low token retention ratios while maintaining accuracy comparable to full-token baselines, most

EarlyTom: Early Token Compression Completes Fast Video Understanding · Related Technologies