AdvantageFlow: Advantage-Weighted Least Squares for RL in Flow Models (Event)

Name: AdvantageFlow: Advantage-Weighted Least Squares for RL in Flow Models
Start: 2026-05-26

Type: PRODUCT_LAUNCH
Date: 2026-05-26
Impact: MEDIUM

Source: arXiv:2605.26013v1 (announce type: cross)

Abstract: We introduce AdvantageFlow, a forward-process reinforcement learning algorithm for rectified flow models. Unlike Flow-GRPO, which optimizes the reverse process, we optimize an advantage-weighted forward-process prediction loss. This optimization problem is unstable when advantages are negative and the loss becomes non-convex. We stabilize it by rollout policy regularization, w
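
The abstract is cut off before it gives any formulas, so the following is only a minimal sketch of what an advantage-weighted forward-process least-squares loss could look like. It is not the paper's implementation. It assumes the common rectified-flow convention x_t = (1 - t) * x0 + t * x1 with velocity target x1 - x0, and the names `interpolate`, `velocity_target`, `advantage_weighted_loss`, and `zero_model` are hypothetical. The rollout policy regularization mentioned in the abstract is not modeled here.

```python
import random

def interpolate(x0, x1, t):
    # Assumed forward process: straight line from sample x0 (t=0) to noise x1 (t=1).
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]

def velocity_target(x0, x1):
    # Constant velocity along that straight line (assumed convention: x1 - x0).
    return [b - a for a, b in zip(x0, x1)]

def advantage_weighted_loss(predict, samples, advantages, rng):
    # Mean over samples of A_i * ||v_pred - v_target||^2.
    total = 0.0
    for x0, adv in zip(samples, advantages):
        t = rng.random()                          # random time in [0, 1)
        x1 = [rng.gauss(0.0, 1.0) for _ in x0]    # Gaussian noise endpoint
        xt = interpolate(x0, x1, t)
        v_pred = predict(xt, t)
        v_tgt = velocity_target(x0, x1)
        sq_err = sum((p - q) ** 2 for p, q in zip(v_pred, v_tgt))
        total += adv * sq_err
    return total / len(samples)

def zero_model(xt, t):
    # Hypothetical stand-in for a velocity network: always predicts zero.
    return [0.0 for _ in xt]

samples = [[1.0, 2.0], [0.5, -0.5]]
loss_pos = advantage_weighted_loss(zero_model, samples, [1.0, 1.0], random.Random(0))
loss_neg = advantage_weighted_loss(zero_model, samples, [-1.0, -1.0], random.Random(0))
```

With identical random draws, `loss_neg` is the exact negative of `loss_pos`. A negative advantage therefore rewards a larger prediction error, and the per-sample term becomes concave and unbounded below. This is consistent with the instability the abstract describes.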

Artificial Intelligence
