Event: Your Multimodal Speech Model Says I Have a Face for Radio
PRODUCT_LAUNCH | 2026-06-01 | Impact: MEDIUM
arXiv:2605.30472v1 (Announce Type: new)
Abstract: As large neural models have become better at language tasks, researchers are increasingly building multi- and omnimodal models that handle more modalities of data. One example is the expansion of speech recognition models to audio-visual data for noise mitigation and multimodal subtitling. While performance and bias have been studied extensively in the single-modality regime, it is unknown
Related coverage (1)
Your Multimodal Speech Model Says I Have a Face for Radio
ArXiv CS.CL, 2026-06-01