AlignVid: Training-Free Attention Scaling for Semantic Fidelity in Text-Guided Image-to-Video Generation 事件

PRODUCT_LAUNCH2026-05-29影响: MEDIUM

AlignVid: Training-Free Attention Scaling for Semantic Fidelity in Text-Guided Image-to-Video Generation arXiv:2512.01334v2 Announce Type: replace Abstract: Text-guided image-to-video generation has made substantial progress, yet it still struggles to execute text-specified edits that require substantial changes to a reference image (\textit{e.g., object addition, removal, or modification}). Empirically, our analysis reveals that this stems from \textbf{visual dominance}, where the reference im