minWM: A Full-Stack Open-Source Framework for Real-Time Interactive Video World Models 文章

ArXiv CS.CV2026-05-29NEWSen作者: Min Zhao, Hongzhou Zhu, Bokai Yan, Zihan Zhou, Yimin Chen, Wenqiang Sun, Kaiwen Zheng, Guande He, Xiao Yang, Chongxuan Li, Fan Bao, Jun Zhu

查看原文 →

关系图谱

摘要

arXiv:2605.30263v1 Announce Type: new Abstract: Recent video diffusion foundation models have achieved remarkable progress in high-quality video generation, yet turning them into real-time interactive video world models remains challenging. Interactive world models require controllable, causal, and low-latency rollout, which in practice demands a full pipeline spanning data construction, controllable fine-tuning, autoregressive training, few-step distillation, and streaming inference. In this work, we present minWM, a full-stack open-source framework for building real-time interactive video world models. minWM provides an end-to-end pipeline that converts existing bidirectional T2V/TI2V video foundation models into camera-controllable few-step autoregressive world models.

minWM: A Full-Stack Open-Source Framework for Real-Time Interactive Video World Models 文章

摘要

相关事件查看全部 (2)

相关公司

相关人物

相关产品查看全部 (4)

相关技术查看全部 (2)