StreamingVLM: Real-Time Understanding for Infinite Video Streams
Event

PRODUCT_LAUNCH | 2026-06-02 | Impact: MEDIUM

arXiv:2510.09608v2 Announce Type: replace

Abstract: Vision-language models (VLMs) could power real-time assistants and autonomous agents, but they face a critical challenge: understanding near-infinite video streams without escalating latency and memory usage. Processing entire videos with full attention leads to quadratic computational costs and poor performance on long videos. Meanwhile, simple sliding window methods are also fl
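
To make the cost claim in the abstract concrete, here is a minimal sketch that counts query-key pairs under causal full attention versus a causal sliding window. The function names and the window size are illustrative assumptions, not taken from the paper, and the sketch models only the attention pair count, not StreamingVLM's actual method.

```python
def full_attention_pairs(num_tokens: int) -> int:
    """Causal full attention: token i attends to tokens 0..i (inclusive)."""
    return num_tokens * (num_tokens + 1) // 2


def sliding_window_pairs(num_tokens: int, window: int) -> int:
    """Causal sliding window: token i attends to at most `window`
    most recent tokens, itself included (hypothetical window size)."""
    return sum(min(i + 1, window) for i in range(num_tokens))


if __name__ == "__main__":
    window = 100  # assumed value, for illustration only
    for n in (1_000, 2_000, 4_000):
        print(n, full_attention_pairs(n), sliding_window_pairs(n, window))
```

Doubling the stream length roughly quadruples the full-attention count but only about doubles the windowed count, which is why unbounded streams are impractical with full attention. The count says nothing about answer quality, which the abstract treats as a separate concern for windowed methods.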