AFUN: Towards an Affordance Foundation Model for Functionality Understanding

arXiv cs.CV · 2026-06-02 · Authors: Zhaoning Wang, Yi Zhong, Jiawei Fu, Henrik I. Christensen, Jun Gao

Abstract

arXiv:2606.02551v1 (cross-listed). Affordance understanding bridges visual perception and physical action, serving as an explainable interface for robot manipulation in open, unstructured real-world environments. Yet building an affordance foundation model that both understands where and how an interaction should happen and generalizes across diverse environments, objects, and tasks remains a long-standing research challenge. Existing methods typically address only part of this challenge: they either localize task-relevant regions without specifying executable motion, or predict motion with limited scalability. In this paper, we present AFUN, a step towards an affordance foundation model for functionality understanding. From a single RGB-D observation and a language task description, AFUN predicts a task-conditional functional mask (where to interact) and a 3D post-contact motion curve (how to interact).
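
The input and output described in the abstract can be pictured as two small data containers. The sketch below is illustrative only: the class names, field names, the metre unit for depth, and both helper functions are assumptions made for this example and are not taken from the paper or any released code.

```python
from dataclasses import dataclass
from math import dist
from typing import List, Tuple

Point3D = Tuple[float, float, float]


@dataclass
class Observation:
    """One RGB-D frame plus a language task description (hypothetical container)."""
    rgb: List[List[Tuple[int, int, int]]]  # H x W colour pixels
    depth: List[List[float]]               # H x W depth values (metres assumed)
    task: str                              # e.g. "open the drawer"


@dataclass
class AffordancePrediction:
    """The two outputs named in the abstract (field names are assumptions)."""
    functional_mask: List[List[bool]]      # H x W, True where to interact
    motion_curve: List[Point3D]            # ordered 3D waypoints after contact


def shapes_match(obs: Observation, pred: AffordancePrediction) -> bool:
    """Check that the colour image and the mask share the depth map's H x W size."""
    h, w = len(obs.depth), len(obs.depth[0])

    def same(grid) -> bool:
        return len(grid) == h and all(len(row) == w for row in grid)

    return same(obs.rgb) and same(pred.functional_mask)


def curve_length(curve: List[Point3D]) -> float:
    """Total polyline length of the motion curve (sum of segment lengths)."""
    return sum(dist(a, b) for a, b in zip(curve, curve[1:]))
```

A caller would build an `Observation` from a camera frame and a task string, then read `functional_mask` to decide where to make contact and `motion_curve` to decide how to move afterwards. How AFUN itself represents these quantities is not stated in the abstract.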
