Proact-VL: A Proactive VideoLLM for Real-Time AI Companions

ArXiv CS.CV | 2026-05-26 | Authors: Weicai Yan, Yuhong Dai, Qi Ran, Haodong Li, Wang Lin, Tao Jin, Xing Xie, Hao Liao, Jianxun Lian

Abstract

arXiv:2603.03447v3 (announce type: replace)

Proactive and real-time interactive experiences are essential for human-like AI companions, yet face three key challenges: (1) achieving low-latency inference under continuous streaming inputs, (2) autonomously deciding when to respond, and (3) controlling both the quality and the quantity of generated content to meet real-time constraints. In this work, we instantiate AI companions through two gaming scenarios, commentator and guide, selected for their suitability for automatic evaluation. We introduce the Live Gaming Benchmark, a large-scale dataset with three representative scenarios: solo commentary, co-commentary, and user guidance. We also present Proact-VL, a general framework that shapes multimodal language models into proactive, real-time interactive agents capable of human-like environment perception and interaction.
