Xiaomi Auto World Model: A Joint World Model Integrating Reconstruction and Generation for Autonomous Driving
Abstract
arXiv:2605.18137v3

This report presents a unified technical system addressing the two core capabilities of world models for autonomous driving: world representation and world generation. For world representation, we propose WorldRec, a feed-forward reconstruction architecture driven by sparse scene queries. WorldRec initializes structured queries in 3D space and uses them to aggregate cross-view, cross-temporal features, which naturally enforces spatial consistency across frames and yields compact yet high-fidelity 3D Gaussian scene representations. For world generation, we propose WorldGen, a two-stage training framework: bidirectional pretraining, followed by causal fine-tuning in three progressive stages (Teacher Forcing, ODE distillation, and DMD). This enables high-quality online causal video generation in as few as 4 denoising steps.
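To make the query-based aggregation idea concrete, here is a minimal sketch of how a small set of scene queries could pool features from several camera views and timesteps through scaled dot-product cross-attention. It is an illustration only, not WorldRec's actual architecture: the abstract does not specify the attention form, and the names `aggregate`, `DIM`, `NUM_QUERIES`, `NUM_VIEWS`, `NUM_FRAMES`, and `TOKENS_PER_IMAGE` are hypothetical. The sketch also omits the 3D initialization and any geometric projection of queries into the images, and uses random vectors in place of learned features.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def aggregate(queries, tokens):
    """Update each query with an attention-weighted mean of all tokens.

    queries: list of DIM-dimensional query embeddings (hypothetical).
    tokens:  list of DIM-dimensional image features, pooled from every
             view and every timestep into one flat list.
    Returns one DIM-dimensional vector per query.
    """
    dim = len(queries[0])
    scale = 1.0 / math.sqrt(dim)
    updated = []
    for q in queries:
        weights = softmax([dot(q, t) * scale for t in tokens])
        updated.append(
            [sum(w * t[d] for w, t in zip(weights, tokens)) for d in range(dim)]
        )
    return updated


# Toy sizes, chosen only for illustration (assumptions, not from the report).
DIM, NUM_QUERIES = 8, 4
NUM_VIEWS, NUM_FRAMES, TOKENS_PER_IMAGE = 3, 2, 5

rng = random.Random(0)
queries = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_QUERIES)]
tokens = [
    [rng.gauss(0, 1) for _ in range(DIM)]
    for _ in range(NUM_VIEWS * NUM_FRAMES * TOKENS_PER_IMAGE)
]

updated = aggregate(queries, tokens)
print(len(updated), len(updated[0]))
```

Because every query attends over the same pooled token set, the number of outputs depends on the number of queries rather than on the number of views or frames, which is one way to read the abstract's "compact" representation claim.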