Flash-GRPO: Efficient Alignment for Video Diffusion via One-Step Policy Optimization
Event: PRODUCT_LAUNCH | 2026-06-04 | Impact: MEDIUM
arXiv:2605.15980v2 (announce type: replace)
Abstract: Group Relative Policy Optimization (GRPO) has emerged as essential for aligning video diffusion models with human preferences, but it faces a critical computational bottleneck: training a 14B-parameter model typically demands hundreds of GPU days per experiment. Existing efficiency methods reduce costs through sliding-window subsampling of training timesteps, but fundame
Source: arXiv cs.CV, 2026-06-04