GEM-4D: Geometry-Enhanced Video World Models for Robot Manipulation
Event

PRODUCT_LAUNCH, 2026-06-01, Impact: MEDIUM

arXiv:2605.22882v2 Announce Type: replace

Abstract: Video world models can generate realistic futures from a single instruction, but they often fail to track the same physical points consistently across time. As a result, the generated videos appear plausible, yet lack the physical grounding required for reliable action execution, such as robot manipulation. We present GEM-4D, a geometry-grounded video world model that resolves