MIND: Multi-Scale Intent Diffusion for Text-Driven Physics-Based Humanoid Control

arXiv cs.CV, 2026-05-26. Authors: Bin Li, Ruichi Zhang, Han Liang, Jingyan Zhang, Juze Zhang, Xin Chen, Jingya Wang

Abstract

arXiv:2605.26006v1 (new submission)

Enabling physics-based humanoids to execute diverse behaviors from high-level textual commands remains a significant challenge. Existing methods typically follow either a two-stage paradigm that combines kinematic motion generation with physics-based tracking, or an end-to-end imitation-learning paradigm that directly generates actions from text. However, the former suffers from the inherent domain shift between kinematic generation and physics-based tracking, while the latter struggles with the substantial modality gap between textual commands and low-level actions, limiting effective semantic alignment. Notably, humanoid states encode rich motion dynamics that are more semantically aligned with textual descriptions than low-level actions, making them a natural basis for deriving behavioral intent.
