To Be Multimodal or Not to Be: Query-Adaptive Audio-Visual Person Retrieval via Active Modality Detection 事件

Name: To Be Multimodal or Not to Be: Query-Adaptive Audio-Visual Person Retrieval via Active Modality Detection
Start: 2026-06-05

PRODUCT_LAUNCH2026-06-05影响: MEDIUM

To Be Multimodal or Not to Be: Query-Adaptive Audio-Visual Person Retrieval via Active Modality Detection arXiv:2606.05931v1 Announce Type: cross Abstract: When retrieving a person from a video archive by voice and face, should the system be multimodal or not? In real-world broadcast archives, unlike curated benchmarks, a target may be heard but unseen, seen but unheard, or both. Fusing scores from an absent modality injects noise, degrading precision below the best unimodal system. We propose

人工智能

关系图谱

To Be Multimodal or Not to Be: Query-Adaptive Audio-Visual Person Retrieval via Active Modality Detection 事件

相关公司查看全部 (10)

相关人物

相关产品查看全部 (10)

相关技术查看全部 (10)

相关报道查看全部 (1)