MOSS-Video-Preview: Toward Real-Time Video Understanding via Cross-Attention

Event: PRODUCT_LAUNCH, Date: 2026-06-09, Impact: MEDIUM

arXiv:2606.07639v1. Announce type: new.

Abstract: Video understanding is shifting from the offline paradigm, in which a model takes a fully recorded video as input and produces a single answer after it ends, toward real-time interaction, in which the model perceives new frames while still replying, revises its answer as new evidence appears, and remains silent when there is nothing to say. We present MOSS-Video-Preview to validate