When and How Much to Imagine: Adaptive Test-Time Scaling with World Models for Visual Spatial Reasoning

Event: PRODUCT_LAUNCH | Date: 2026-06-02 | Impact: MEDIUM

arXiv:2602.08236v2 | Announce Type: replace

Abstract: Despite rapid progress in MLLMs, visual spatial reasoning remains unreliable when correct answers depend on how a scene would appear under unseen or alternative viewpoints. Recent work addresses this by augmenting reasoning with world models for visual imagination, but questions such as when imagination is actually necessary, how much of it