Unsupervised Monocular 3D Keypoint Discovery from Multi-View Diffusion Priors

ArXiv CS.CV, 2026-06-05. Authors: Subin Jeon, In Cho, Junyoung Hong, Woong Oh Cho, Seon Joo Kim

Abstract

arXiv:2507.12336v2 (announce type: replace). Most existing 3D keypoint estimation methods rely on manual annotations or calibrated multi-view images, both of which are expensive to collect. This paper introduces KeyDiff3D, a framework that accurately predicts 3D keypoints from a single image, eliminating the need for such costly data acquisition. To achieve this, we leverage powerful geometric priors embedded in a pretrained multi-view diffusion model. In our framework, the diffusion model generates multi-view images from a single image, and these images serve as supervision signals that provide 3D geometric cues to our model. We also introduce a 3D feature extractor that transforms the implicit 3D priors embedded in the diffusion features into explicit 3D feature volumes. Beyond accurate keypoint estimation, we further introduce a pipeline that enables manipulation of 3D objects generated by the diffusion model. Experimental results on diverse datasets, including Human3.
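
To make the idea of multi-view supervision concrete, the following is a minimal sketch of a multi-view reprojection check in NumPy. It is not the KeyDiff3D implementation: the abstract does not specify the loss, the camera model, or the view layout. Everything here is an assumption for illustration, including the pinhole intrinsics `K`, the four cameras placed at 90-degree steps around the object, the function names, and the use of synthetic 2D targets in place of images generated by a diffusion model. It only shows why agreement across several views constrains a set of 3D keypoints.

```python
import numpy as np

def rotation_y(angle_rad):
    """Rotation matrix about the vertical (y) axis."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def project(points_3d, R, t, K):
    """Project (N, 3) world points to (N, 2) pixels with a pinhole camera."""
    cam = points_3d @ R.T + t      # world -> camera coordinates
    uv = cam @ K.T                 # apply intrinsics
    return uv[:, :2] / uv[:, 2:3]  # perspective divide

def multiview_reprojection_error(points_3d, targets_2d, cameras, K):
    """Mean pixel distance between projected keypoints and per-view 2D targets."""
    errors = [
        np.linalg.norm(project(points_3d, R, t, K) - uv, axis=1).mean()
        for (R, t), uv in zip(cameras, targets_2d)
    ]
    return float(np.mean(errors))

# Hypothetical setup: assumed intrinsics and four views around the object.
K = np.array([[500.0, 0.0, 128.0], [0.0, 500.0, 128.0], [0.0, 0.0, 1.0]])
cameras = [
    (rotation_y(a), np.array([0.0, 0.0, 4.0]))
    for a in np.deg2rad([0.0, 90.0, 180.0, 270.0])
]

# Synthetic 3D keypoints and their 2D targets in every view.
keypoints = np.array([[0.0, 0.0, 0.0], [0.5, 0.2, -0.3], [-0.4, 0.6, 0.1]])
targets = [project(keypoints, R, t, K) for R, t in cameras]

# The true keypoints agree with all views; a shifted guess does not.
error_true = multiview_reprojection_error(keypoints, targets, cameras, K)
error_shifted = multiview_reprojection_error(keypoints + 0.1, targets, cameras, K)
print(error_true, error_shifted)
```

In this toy setting the error is zero only for keypoints consistent with every view, which is the kind of geometric cue that multiple views of the same object can provide to a single-image predictor.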
