Learning Predictive Visuomotor Coordination

ArXiv CS.CV | 2026-06-05 | Authors: Wenqi Jia, Bolin Lai, Miao Liu, Danfei Xu, James M. Rehg

Abstract

arXiv:2503.23300v2 (announce type: replace)

Understanding and predicting human visuomotor coordination is crucial for applications in robotics, human-computer interaction, and assistive technologies. This work introduces a forecasting-based task for visuomotor modeling, where the goal is to predict head pose, gaze, and upper-body motion from egocentric visual and kinematic observations. We propose a Visuomotor Coordination Representation (VCR) that learns structured temporal dependencies across these multimodal signals. We extend a diffusion-based motion modeling framework that integrates egocentric vision and kinematic sequences, enabling temporally coherent and accurate visuomotor predictions. Our approach is evaluated on the large-scale EgoExo4D dataset, demonstrating strong generalization across diverse real-world activities.
