AdaState: Self-Evolving Anchors for Streaming Video Generation

ArXiv CS.CV | 2026-05-29 | Authors: Yusuf Dalva, Pinar Yanardag

Abstract

arXiv:2605.30349v1. Autoregressive video diffusion models generate streaming video by producing frames sequentially, conditioning each chunk on previously generated content. These models are structurally anchored to the first frame: its key-value representation occupies a privileged position in the attention cache and serves as the primary scene reference throughout generation. As the cleanest and most error-free position in the cache, this anchor draws disproportionate attention, suppressing video dynamics and locking scene composition to the initial viewpoint even as the scene naturally evolves. The result is a temporally shallow video in which motion, camera movement, and scene progression are dampened in favor of static consistency. To address this, we replace the static anchor with an adaptive state, a hidden latent that the model denoises alongside content at every chunk but never renders.
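
The cache layout described in the abstract can be illustrated with a toy bookkeeping sketch. This is not the authors' implementation: the names `AnchoredCache` and `generate`, the rolling-window size, the choice to seed the anchor slot with the first frame, and the string placeholders standing in for denoised latents are all assumptions made for illustration. It only shows which entries a chunk would attend to when the anchor slot is pinned to the first frame versus refreshed with a hidden state after each chunk.

```python
from collections import deque


class AnchoredCache:
    """Toy cache: one anchor slot plus a rolling window of recent chunks."""

    def __init__(self, anchor, window):
        self.anchor = anchor                  # privileged slot (hypothetical layout)
        self.recent = deque(maxlen=window)    # oldest chunks fall out automatically

    def context(self):
        # What the next chunk would attend to: the anchor first, then recent chunks.
        return [self.anchor, *self.recent]


def generate(first_frame, num_chunks, window, adaptive):
    """Return (rendered chunks, per-chunk attention contexts).

    Strings stand in for latents; no real denoising happens here.
    """
    cache = AnchoredCache(anchor=first_frame, window=window)
    rendered, contexts = [], []
    for t in range(1, num_chunks + 1):
        contexts.append(cache.context())
        chunk = f"chunk{t}"                   # placeholder for the denoised content
        rendered.append(chunk)                # only content is rendered
        cache.recent.append(chunk)
        if adaptive:
            # Assumption: the hidden state produced alongside the chunk
            # replaces the anchor slot; it is never added to `rendered`.
            cache.anchor = f"state{t}"
    return rendered, contexts


static_frames, static_ctx = generate("frame0", 4, window=2, adaptive=False)
adaptive_frames, adaptive_ctx = generate("frame0", 4, window=2, adaptive=True)
print(static_ctx[-1])    # ['frame0', 'chunk2', 'chunk3']
print(adaptive_ctx[-1])  # ['state3', 'chunk2', 'chunk3']
```

In the static variant every context begins with `frame0`, however far generation has progressed; in the adaptive variant the anchor slot tracks the most recent hidden state while the rendered output still contains only content chunks.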
