Do Joint Audio-Video Generation Models Understand Physics? Event

PRODUCT_LAUNCH 2026-06-02 Impact: MEDIUM

arXiv:2605.07061v2. Announce Type: replace-cross. Abstract: Joint audio-video generation models are rapidly approaching professional production quality, raising a central question: do they understand audio-visual physics, or merely generate plausible sounds and frames that violate real-world consistency? We introduce AV-Phys Bench, a benchmark for evaluating physical commonsense in joint audio-video generation. AV-Phys Bench tests models