G3T Up! Gravity Aligned Coordinate Frames Simplify Pointmap Processing

arXiv cs.CV | 2026-05-27 | Authors: Bharath Raj Nagoor Kani, Noah Snavely

Abstract

arXiv:2605.27372v1

Modern feed-forward 3D reconstruction methods like VGGT predict pixel-aligned pointmaps in camera-centric coordinate frames. However, this choice of coordinate frame is not always optimal. We propose instead to predict pointmaps in upright, gravity-aligned frames that exploit strong structural cues present in many real-world scenes. Unlike camera-centric frames, gravity-aligned frames share a common vertical axis across viewpoints, reducing the rotational degrees of freedom needed to relate pointmaps to one another. To this end, we introduce the Gravity Grounded Geometry Transformer (G3T), fine-tuned from existing models on gravity-aligned 3D data. G3T produces highly accurate gravity-aware predictions, including upright pointmaps and camera-to-gravity poses.
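The claim that gravity-aligned frames reduce the rotational degrees of freedom can be illustrated with a small geometric sketch. It is not the G3T model or its code: the camera poses, the `+Z` up convention, and all helper names (`camera_to_gravity`, `rot_axis_angle`, and so on) are assumptions made for illustration. In particular, the up direction is derived here from made-up ground-truth poses, whereas the abstract describes a network that predicts the camera-to-gravity pose. The sketch shows that once each camera frame is rotated upright, the rotation relating the two upright frames leaves the vertical axis fixed, so only a yaw angle remains.

```python
import math

UP = [0.0, 0.0, 1.0]  # assumed convention: +Z is "up" in the gravity-aligned frame

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def apply(R, v):
    return [sum(R[i][k] * v[k] for k in range(3)) for i in range(3)]

def rot_axis_angle(axis, angle):
    """Rodrigues' formula: rotation by `angle` radians about `axis`."""
    n = math.sqrt(sum(a * a for a in axis))
    x, y, z = (a / n for a in axis)
    c, s = math.cos(angle), math.sin(angle)
    C = 1.0 - c
    return [
        [c + x * x * C, x * y * C - z * s, x * z * C + y * s],
        [y * x * C + z * s, c + y * y * C, y * z * C - x * s],
        [z * x * C - y * s, z * y * C + x * s, c + z * z * C],
    ]

def camera_to_gravity(up_cam):
    """Smallest rotation taking the camera-frame up vector onto UP."""
    n = math.sqrt(sum(a * a for a in up_cam))
    ux, uy, uz = (a / n for a in up_cam)
    axis = [uy, -ux, 0.0]      # cross(up_cam, UP)
    s = math.hypot(ux, uy)     # norm of the cross product = sin(angle)
    if s < 1e-12:
        # Already upright, or exactly upside down (then flip about X).
        if uz > 0:
            return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
        return [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, -1.0]]
    return rot_axis_angle(axis, math.atan2(s, uz))

# Two hypothetical camera-to-world rotations (world Z points up).
R_wc_1 = rot_axis_angle([1.0, 0.3, 0.2], 0.7)
R_wc_2 = rot_axis_angle([0.2, 1.0, -0.5], -1.1)

# World up direction expressed in each camera frame.
up_1 = apply(transpose(R_wc_1), UP)
up_2 = apply(transpose(R_wc_2), UP)

# Camera-to-gravity rotations for each view.
A_1 = camera_to_gravity(up_1)
A_2 = camera_to_gravity(up_2)

# Relative rotation between the raw camera frames (camera 1 -> camera 2):
# a general 3-DoF rotation.
R_cam = matmul(transpose(R_wc_2), R_wc_1)

# The same relation expressed between the two upright frames.
R_up = matmul(matmul(A_2, R_cam), transpose(A_1))

# R_up fixes the vertical axis, so a single yaw angle describes it.
yaw = math.atan2(R_up[1][0], R_up[0][0])

# An upright pointmap is the camera-frame pointmap rotated by A (toy points).
pointmap_cam_1 = [[0.5, -0.2, 2.0], [1.0, 0.4, 3.5]]
pointmap_up_1 = [apply(A_1, p) for p in pointmap_cam_1]

print("camera-frame relative rotation, zz entry:", round(R_cam[2][2], 4))
print("upright relative rotation, zz entry:", round(R_up[2][2], 4))
print("remaining yaw (radians):", round(yaw, 4))
```

In this sketch the third row and column of `R_up` reduce to `(0, 0, 1)` up to floating-point error, while `R_cam` has no such structure. Relating two upright pointmaps therefore needs one rotation angle plus a translation, instead of a full three-parameter rotation.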