Robust Dreamer: Deviation-Aware Latent Gaussian Memory for Action-Controlled AR Video Generation

arXiv cs.CV, 2026-06-01. Authors: Hanlin Chen, Jiaxin Wei, Xibin Song, Yifu Wang, Steve Wang, Hongdong Li, Pan Ji, Gim Hee Lee


Abstract

arXiv:2605.30855v1 (announce type: new)

Frame-wise action-controlled image-to-video generation is a promising paradigm for interactive world simulation, where each control signal should elicit an immediate visual response. However, maintaining visual fidelity and 3D consistency over long autoregressive rollouts remains challenging. Existing 3D-aware methods often suffer from catastrophic drift due to two impediments: information loss from Latent-RGB Cycling, where generated latents are repeatedly decoded to RGB and re-encoded for future conditioning, and the training-inference gap induced by the error-free hypothesis, where clean training memory fails to match prediction-corrupted inference memory. To address these challenges, we present Robust Dreamer, a memory-augmented framework built around how to design 3D memory and how to use it robustly.

