HALO: Learning Human-Robot Collaboration via Heterogeneous-Agent Lyapunov Policy Optimization

ArXiv CS.AI | 2026-06-02 | NEWS | en | Authors: Hao Zhang, Yaru Niu, Yikai Wang, Ding Zhao, H. Eric Tseng

Details

Source site: ArXiv CS.AI
Authors: Hao Zhang, Yaru Niu, Yikai Wang, Ding Zhao, H. Eric Tseng
Article type: NEWS
Language: en
Publication date: 2026-06-02

Abstract

arXiv:2603.03741v2. Announce type: replace-cross.

To improve generalization and resilience in human-robot collaboration (HRC), robots must contend with diverse combinations of human behaviors and contexts, motivating multi-agent reinforcement learning (MARL). However, inherent heterogeneity between robots and humans creates a rationality gap (RG), where decentralized policy updates deviate from cooperative joint optimization. The resulting learning problem is a general-sum differentiable game, so independent policy-gradient updates can oscillate or diverge without added structure. We propose heterogeneous-agent Lyapunov policy optimization (HALO), a framework that stabilizes decentralized MARL by enforcing Lyapunov-based contraction in policy-parameter space. Unlike Lyapunov-based safe RL, which targets state/trajectory constraints in constrained Markov decision processes, HALO uses Lyapunov certification to stabilize decentralized policy learning.
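The instability of independent gradient updates in a differentiable game can be illustrated with a toy example. The sketch below is not the HALO algorithm, whose details the abstract does not give. It assumes a two-player bilinear game (losses L1 = x*y and L2 = -x*y, a zero-sum special case chosen for simplicity) and uses a consensus-optimization-style correction as a stand-in stabilizer. The function name `simulate`, the parameter `gamma`, and the Lyapunov candidate V = 0.5*(x^2 + y^2) are all illustrative assumptions.

```python
def simulate(lr, gamma, steps=100, x=1.0, y=1.0):
    """Run simultaneous gradient play on a toy bilinear game.

    Player 1 minimizes L1 = x*y over x; player 2 minimizes L2 = -x*y over y.
    The unique equilibrium is (0, 0). Returns the trajectory of the
    Lyapunov candidate V = 0.5*(x^2 + y^2), i.e. squared distance to it.

    gamma = 0 gives plain independent updates. gamma > 0 adds the gradient
    of H = 0.5*(gx^2 + gy^2), a consensus-optimization-style correction
    (an assumed stand-in, not the method from the paper).
    """
    values = [0.5 * (x * x + y * y)]
    for _ in range(steps):
        gx = y    # dL1/dx
        gy = -x   # dL2/dy
        # For this game H = 0.5*(x^2 + y^2), so its gradient is (x, y).
        hx, hy = x, y
        # Both players update simultaneously from the old (x, y).
        x, y = x - lr * (gx + gamma * hx), y - lr * (gy + gamma * hy)
        values.append(0.5 * (x * x + y * y))
    return values


if __name__ == "__main__":
    plain = simulate(lr=0.1, gamma=0.0)
    corrected = simulate(lr=0.1, gamma=1.0)
    print("plain updates:     V start, end =", plain[0], plain[-1])
    print("corrected updates: V start, end =", corrected[0], corrected[-1])
```

In this toy setting each plain step multiplies V by 1 + lr^2, so the iterates spiral away from the equilibrium, while the corrected step multiplies V by (1 - lr*gamma)^2 + lr^2, which is below 1 for lr = 0.1 and gamma = 1. Checking that V shrinks at every step is the simplest form of a contraction condition in parameter space.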
