Diffusion-Augmented Markov Decision Processes for Maximum Entropy Reinforcement Learning
Event: PRODUCT_LAUNCH | 2026-05-28 | Impact: MEDIUM
arXiv:2512.02019v3 | Announce Type: replace-cross
Abstract: Diffusion models excel at sampling from complex, unnormalized distributions. In this work, we extend Maximum Entropy Reinforcement Learning (ME-RL) to diffusion processes, enabling sampling from the optimal policy trajectory distribution. By minimizing a tractable upper bound on the reverse KL divergence between the diffusion policy and the optimal p