ViewMask-1-to-3: Multi-View Consistent Image Generation via Multimodal Discrete Diffusion Models (Event)

PRODUCT_LAUNCH | 2026-06-04 | Impact: MEDIUM

arXiv:2512.14099v3 (Announce Type: replace)

Abstract: Motivated by discrete diffusion's success in language-vision modeling, we explore its potential for multi-view generation, a task dominated by continuous approaches. We introduce ViewMask-1-to-3, formulating multi-view generation as a discrete sequence modeling problem where each viewpoint is represented as visual tokens from MAGVIT-v2. Through dis
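To make the "discrete sequence modeling" framing concrete, here is a minimal sketch of how per-view token lists might be laid out as one sequence and partially masked. It is not the paper's implementation. The mask token id, the helper names `build_sequence` and `mask_target_views`, and the masking scheme itself (source view kept intact, target views masked at a fixed ratio) are all assumptions made for illustration. The integer tokens stand in for visual tokenizer output; no real MAGVIT-v2 API is called.

```python
import random

MASK_ID = -1  # hypothetical mask token id, not taken from the paper


def build_sequence(view_tokens):
    """Concatenate per-view token lists into one flat sequence.

    Returns the flat sequence and a list of (start, end) spans, one per view.
    """
    seq, spans = [], []
    for tokens in view_tokens:
        start = len(seq)
        seq.extend(tokens)
        spans.append((start, len(seq)))
    return seq, spans


def mask_target_views(seq, spans, source_view, mask_ratio, rng):
    """Replace a fraction of tokens in every non-source view with MASK_ID.

    The source view is left untouched so it can act as conditioning
    (an assumed setup for illustration).
    """
    out = list(seq)
    for view_index, (start, end) in enumerate(spans):
        if view_index == source_view:
            continue
        positions = list(range(start, end))
        n_mask = round(mask_ratio * len(positions))
        for pos in rng.sample(positions, n_mask):
            out[pos] = MASK_ID
    return out


# Toy stand-ins for tokenized views: three views, four tokens each.
views = [[10, 11, 12, 13], [20, 21, 22, 23], [30, 31, 32, 33]]
seq, spans = build_sequence(views)
masked = mask_target_views(seq, spans, source_view=0, mask_ratio=0.5,
                           rng=random.Random(0))
fully_masked = mask_target_views(seq, spans, source_view=0, mask_ratio=1.0,
                                 rng=random.Random(0))
```

In this toy setup a model would be trained to predict the original tokens at the masked positions given the unmasked ones. Whether ViewMask-1-to-3 uses exactly this corruption process cannot be confirmed from the truncated abstract.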