Test-Time Gradient Guidance of Flow Policies in Reinforcement Learning (Event)

Name: Test-Time Gradient Guidance of Flow Policies in Reinforcement Learning
Start: 2026-06-10

PRODUCT_LAUNCH | 2026-06-10 | Impact: MEDIUM

Test-Time Gradient Guidance of Flow Policies in Reinforcement Learning
arXiv:2606.11087v1, Announce Type: cross
Abstract: Expressive continuous control policies, such as diffusion and flow models, form the backbone of recent advances in scaling imitation learning for simulated and real robot control. While they are known to scale stably in the supervised imitation learning setting, incorporating them into reinforcement learning (RL) pipelines for policy improvement has proven more difficult. It

Artificial Intelligence
