Inference-Time Scaling for Joint Audio-Video Generation
Event: PRODUCT_LAUNCH, 2026-06-03, Impact: MEDIUM
arXiv:2606.03183v1 Announce Type: cross
Abstract: Joint audio-video generation aims to synthesize realistic audio-video pairs that are both semantically aligned with text prompts and precisely synchronized. While existing joint audio-video generation models often require substantial training resources to improve fidelity, Inference-Time Scaling (ITS) has recently emerged as a promising training-free alternative in single-modality domains.
Source: ArXiv CS.CV, 2026-06-03