MilliVid: Hierarchical Latents for Long-Range Consistency in Video Generation

Event: PRODUCT_LAUNCH | 2026-06-09 | Impact: MEDIUM

arXiv:2606.09056v1 Announce Type: new

Abstract: Video generative models have become increasingly powerful, but long-range consistency remains challenging to achieve because even a few dozen frames require impractically long transformer sequence lengths. We show that this issue can be mitigated by generating video using coarse-to-fine rollout within a multi-scale token space. Our approach is simple: first, we pre-train