VideoFDB: Evaluating Full-Duplex Vision-Speech Capabilities in Conversational Agents
Event: PRODUCT_LAUNCH | Date: 2026-05-29 | Impact: MEDIUM
arXiv:2605.30256v1 (announce type: new)
Abstract: Natural human conversation is full-duplex and audio-visual: people simultaneously speak and listen while continuously interpreting and producing nonverbal cues, such as nods, smiles, and gestures. To support successful human-agent interaction, agents must model full-duplex audiovisual conversation; however, existing full-duplex benchmarks evaluate only speech. In
Source: ArXiv CS.CV, 2026-05-29