Déjà View: Looping Transformers for Multi-View 3D Reconstruction

ArXiv CS.CV, 2026-05-29. Authors: Alessandro Burzio, Tobias Fischer, Sven Elflein, Qunjie Zhou, Riccardo de Lutio, Jiawei Ren, Jiahui Huang, Shengyu Huang, Marc Pollefeys, Laura Leal-Taixé, Zan Gojcic, Haithem Turki

Abstract

arXiv:2605.30215v1

Recent feed-forward 3D reconstruction transformers have scaled to over a billion parameters, following the broader trend of increasing model capacity in computer vision. Yet emerging evidence suggests that contiguous transformer layers often behave like repeated applications of similar operations, and multi-view reconstruction transformers refine their predictions progressively across decoder depth. We posit that model depth partially buys iteration, paid for inefficiently in unique parameters, and instead make that iteration explicit in architecture. Our model, DéjàView, applies a single looped transformer block recurrently to per-view features for K refinement steps.
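The contrast the abstract draws, between depth bought with unique parameters and iteration made explicit through one shared block, can be illustrated with a small sketch. This is not the authors' implementation: `ToyBlock`, `looped_refine`, and `stacked_refine` are hypothetical names, and the block is a deliberately reduced stand-in (single-head attention across views plus a residual, with no MLP or normalization) written in plain Python so that it runs without dependencies.

```python
import math
import random


def matvec(W, x):
    """Multiply a d x d matrix (list of rows) by a length-d vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class ToyBlock:
    """Hypothetical stand-in for a transformer block (assumption, not the paper's).

    Each view is one feature vector; views attend to each other and the
    result is added back as a residual update.
    """

    def __init__(self, dim, rng):
        self.dim = dim
        scale = 1.0 / math.sqrt(dim)

        def init():
            return [[rng.gauss(0.0, scale) for _ in range(dim)] for _ in range(dim)]

        self.Wq, self.Wk, self.Wv = init(), init(), init()

    def num_params(self):
        # Three d x d projection matrices.
        return 3 * self.dim * self.dim

    def __call__(self, views):
        q = [matvec(self.Wq, v) for v in views]
        k = [matvec(self.Wk, v) for v in views]
        val = [matvec(self.Wv, v) for v in views]
        out = []
        for i, x in enumerate(views):
            scores = [
                sum(a * b for a, b in zip(q[i], kj)) / math.sqrt(self.dim)
                for kj in k
            ]
            weights = softmax(scores)
            mixed = [
                sum(w * vj[c] for w, vj in zip(weights, val))
                for c in range(self.dim)
            ]
            out.append([xc + mc for xc, mc in zip(x, mixed)])  # residual update
        return out


def looped_refine(block, views, num_steps):
    """Apply one block, with the same weights, for num_steps (K) refinements."""
    for _ in range(num_steps):
        views = block(views)
    return views


def stacked_refine(blocks, views):
    """Baseline for comparison: a stack of blocks, each with its own weights."""
    for block in blocks:
        views = block(views)
    return views


if __name__ == "__main__":
    rng = random.Random(0)
    dim, num_views, K = 4, 3, 6
    views = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(num_views)]

    shared = ToyBlock(dim, rng)
    stack = [ToyBlock(dim, rng) for _ in range(K)]

    refined = looped_refine(shared, views, K)
    print("looped params: ", shared.num_params())
    print("stacked params:", sum(b.num_params() for b in stack))
    print("refined views: ", len(refined), "x", len(refined[0]))
```

In this toy setting both paths perform K block applications, so compute is comparable, but the looped variant stores one block's weights while the stack stores K times as many. Whether a real looped model matches the accuracy of a deep stack is an empirical question the sketch does not address.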
