Event: Native Audio-Visual Alignment for Generation
OPEN_SOURCE | 2026-05-29 | Impact: MEDIUM
arXiv:2605.30073v1 | Announce Type: new
Abstract: Joint audio-video generation aims to synthesize temporally synchronized and semantically coherent visual-acoustic content. However, existing open-source methods mainly rely on either dual-tower designs with posterior alignment or fully unified tri-modal designs that mix textual context, audio and video in one shared space. The former weakens fine-grained audio-video co-evolution, while the latter couple
Related coverage (1)
Native Audio-Visual Alignment for Generation
ArXiv CS.CV | 2026-05-29