Every9D-21M: Large-Scale Real-World 9D Canonicalization of Everyday Objects

ArXiv cs.CV | 2026-05-28 | Authors: Leonhard Sommer, Emil Akopyan, Adam Kortylewski

Abstract

arXiv:2605.28270v1 (announce type: new). Estimating the 9D pose of everyday objects from a single real-world image remains challenging, largely because large-scale supervision is lacking. Most existing datasets either rely heavily on synthetic renderings or provide limited coverage of real-world objects: the largest real-world 9D pose dataset to date contains only 17K annotated objects across 9 categories. We address this gap with Every9D-21M, a dataset of 9D pose annotations for 21.8M real-world images from 109K object-centric videos spanning 700 everyday object categories, two orders of magnitude larger than prior real-world 9D pose benchmarks in both image and category count. To achieve this scale, we leverage object-centric videos by reconstructing object-level point clouds via multi-view geometry and aligning similar instances into a shared canonical coordinate frame. Canonical poses are manually annotated for only a small set of reference objects (fewer than 0.
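The abstract does not spell out its pose parameterization. As a rough illustration only, the sketch below assumes the common convention in which a 9D pose combines a 3D rotation, a 3D translation, and a per-axis 3D scale, and maps points from a canonical object frame into the camera frame as x_cam = R (s * x) + t. The names `rot_z` and `apply_pose_9d` are hypothetical and do not come from the paper.

```python
import math

def rot_z(angle):
    """3x3 rotation matrix about the z-axis (one of the 3 rotation DoF)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0],
            [s,  c, 0.0],
            [0.0, 0.0, 1.0]]

def apply_pose_9d(points, rotation, translation, scale):
    """Map canonical-frame points to the camera frame.

    Assumed convention (not confirmed by the source):
    x_cam = R @ (scale * x_canonical) + t, with per-axis scale.
    """
    out = []
    for p in points:
        # 3 scale DoF: stretch each canonical axis independently.
        scaled = [p[i] * scale[i] for i in range(3)]
        # 3 rotation DoF: rotate the scaled point.
        rotated = [sum(rotation[r][c] * scaled[c] for c in range(3))
                   for r in range(3)]
        # 3 translation DoF: move the object into place.
        out.append(tuple(rotated[i] + translation[i] for i in range(3)))
    return out

# Usage: a unit point on the canonical x-axis, doubled in x,
# rotated 90 degrees about z, then shifted 5 units along z.
corner = apply_pose_9d([(1.0, 0.0, 0.0)],
                       rot_z(math.pi / 2),
                       (0.0, 0.0, 5.0),
                       (2.0, 1.0, 1.0))[0]
```

Under this convention, two instances of the same category share a canonical frame when their canonical axes (for example "up" and "front") agree, so only the rotation, translation, and scale differ per image.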
