HERO: Learning Humanoid End-Effector Control for Visual Whole-Body Open-Vocabulary Object Grasping

ArXiv CS.CV | 2026-06-05 | NEWS | en | Authors: Runpei Dong, Ziyan Li, Arjun Gupta, Xialin He, Saurabh Gupta

Details

Source site: ArXiv CS.CV
Authors: Runpei Dong, Ziyan Li, Arjun Gupta, Xialin He, Saurabh Gupta
Article type: NEWS
Language: en
Publication date: 2026-06-05

Abstract

arXiv:2602.16705v3. Announce type: replace-cross.

Visual loco-manipulation of arbitrary in-the-wild objects requires accurate end-effector (EE) control and a generalizable understanding of the scene from visual inputs (e.g., RGB-D images). Existing imitation and sim2real methods learn both of these aspects jointly via monolithic end-to-end learning and are thus hard to scale. In this work, we bring to bear the best tools for each of these problems (large vision models for generalizable scene understanding and simulated training for accurate EE control), leading to an overall modular loco-manipulation system that exhibits strong generalization. Our core technical innovation is HERO, an accurate residual-aware EE tracking policy made possible by combining classical robotics with machine learning.
