Decoupled Behavioral Cloning for Scalable Inductive Generalization in RL from Specifications 文章

ArXiv CS.AI2026-06-02NEWSen作者: Vignesh Subramanian, Subhajit Roy, Suguman Bansal

摘要

arXiv:2606.00838v1 Announce Type: new Abstract: Inductive generalization is a framework for reinforcement learning (RL) generalization in which inductively related task instances admit inductively related policies. Prior work captures this structure via a higher-order policy-evolution function learned directly with RL, but suffers from poor training scalability: as training tasks grow, aggregated reward feedback becomes noisy and conflicting, destabilizing training and weakening generalization. We propose DIBS, a decoupled behavioral cloning approach that separates learning task-specific policies from learning the evolution function. We first learn individual teacher policies per task via standard RL, then fit the evolution function via behavioral cloning on teacher-labeled state-action pairs. This replaces noisy reward aggregation with dense, stable supervision.

Decoupled Behavioral Cloning for Scalable Inductive Generalization in RL from Specifications 文章

摘要

相关事件查看全部 (1)

相关公司

相关人物

相关产品

相关技术查看全部 (2)