CounterFlow: A Two-Phase Inference-Time Sampling for Counterfactual Video Foley Generation · Event

BREAKTHROUGH · 2026-05-26 · Impact: HIGH

arXiv:2605.18916v2 · Announce Type: replace-cross

Abstract: We investigate Counterfactual Video Foley Generation, which aims to adopt a sound-source identity that contradicts the visual evidence while remaining temporally synchronized to a silent video. Existing Video&Text-to-Audio (VT2A) models struggle with this, often remaining anchored to the visually implied sound source when video and text contents di
