LiteFrame: Efficient Vision Encoders Unlock Frame Scaling in Video LLMs (Event)

PRODUCT_LAUNCH · 2026-05-26 · Impact: MEDIUM

arXiv:2605.17260v2 · Announce Type: replace

Abstract: The fundamental challenge in scaling Video Large Language Models (Video LLMs) to long-form video lies in managing the explosion of visual-token context length. Existing strategies predominantly focus on "post-hoc" token reduction, that is, reducing visual tokens after feature extraction to alleviate the LLM's computational overhead. While these methods effectively reduce the numb
