Agentic Active Omni-Modal Perception for Multi-Hop Audio-Visual Reasoning (Event)
OPEN_SOURCE | 2026-05-28 | Impact: MEDIUM
arXiv:2605.28192v1. Announce Type: new. Abstract: Multi-hop audio-visual reasoning remains challenging for Omni-LLMs, as relevant evidence is often sparse, temporally dispersed, and distributed across both audio and visual streams. Existing benchmarks provide limited investigation of this setting, typically involving only a limited number of modalities, relevant temporal segments, or reasoning steps. In this work, we introd
Related coverage (1)
Agentic Active Omni-Modal Perception for Multi-Hop Audio-Visual Reasoning
ArXiv CS.AI, 2026-05-28