Cross-Domain Energy-Guided Diffusion Generation for Off-Dynamics Reinforcement Learning 文章

ArXiv CS.AI2026-05-26NEWSen作者: Yu Yang, Yihong Guo, Anqi Liu, Pan Xu

摘要

arXiv:2605.24810v1 Announce Type: cross Abstract: Off-dynamics offline reinforcement learning seeks to learn a target-domain policy from a large source dataset and a limited target dataset under mismatched transition dynamics. Existing approaches such as reward augmentation and data filtering are constrained to the source dataset and cannot synthesize new target behavior to improve coverage beyond the collected source trajectories. While recent model-based methods attempt to address this by learning target-aware dynamics, the generated experience is constructed only at the transition level, which leads to accumulated errors over long horizons. These limitations necessitate a shift toward trajectory-level generation for off-dynamics offline RL. We propose CEDGE, a Cross-domain Energy-guided Diffusion GEneration framework. CEDGE trains a trajectory diffusion model on source-domain trajectories and adapts the generated samples to the target domain through energy guidance.