Decoupled Behavioral Cloning for Scalable Inductive Generalization in RL from Specifications

arXiv cs.AI, 2026-06-02. Authors: Vignesh Subramanian, Subhajit Roy, Suguman Bansal

Abstract

arXiv:2606.00838v1

Inductive generalization is a framework for reinforcement learning (RL) generalization in which inductively related task instances admit inductively related policies. Prior work captures this structure via a higher-order policy-evolution function learned directly with RL, but it scales poorly in training: as the number of training tasks grows, the aggregated reward feedback becomes noisy and conflicting, which destabilizes training and weakens generalization. We propose DIBS, a decoupled behavioral cloning approach that separates learning task-specific policies from learning the evolution function. We first learn an individual teacher policy for each task via standard RL, then fit the evolution function via behavioral cloning on teacher-labeled state-action pairs. This replaces noisy reward aggregation with dense, stable supervision.
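The two-stage decoupling described in the abstract can be illustrated with a toy sketch. This is not the DIBS algorithm itself, and several pieces are assumptions made purely for illustration: the task family (a 1-D line where task `n` has its goal at position `n`), the closed-form teacher that stands in for a policy trained with RL, and the single task-conditioned linear student that stands in for the evolution function. All function names are hypothetical.

```python
import numpy as np


def make_teacher(goal):
    """Stage 1 stand-in: a per-task teacher policy.

    Assumption: in the paper each teacher is trained with standard RL;
    here a closed-form controller is used so the sketch stays runnable.
    """
    def policy(state):
        return goal - state  # displacement that moves the agent onto the goal
    return policy


def collect_demos(task_ids, states_per_task=20, seed=0):
    """Label sampled states with each task's teacher action."""
    rng = np.random.default_rng(seed)
    features, actions = [], []
    for n in task_ids:
        teacher = make_teacher(goal=float(n))
        for s in rng.uniform(0.0, 2.0 * n, size=states_per_task):
            features.append([s, float(n), 1.0])  # state, task index, bias
            actions.append(teacher(s))
    return np.array(features), np.array(actions)


def fit_student(features, actions):
    """Stage 2: behavioral cloning as least-squares regression on teacher labels."""
    weights, *_ = np.linalg.lstsq(features, actions, rcond=None)
    return weights


def student_action(weights, state, task_id):
    return float(weights @ np.array([state, float(task_id), 1.0]))


# Train on tasks 1..5, then query the student on an unseen, larger task.
train_X, train_y = collect_demos(task_ids=[1, 2, 3, 4, 5])
weights = fit_student(train_X, train_y)
held_out_error = abs(student_action(weights, 3.0, 8) - make_teacher(8.0)(3.0))
print("student weights:", weights)
print("error on held-out task 8:", held_out_error)
```

The student never sees a reward signal: it is fit only on teacher-labeled state-action pairs pooled across tasks, which is the dense supervision the abstract contrasts with aggregated reward feedback. In this toy setting the teachers are exactly linear in the state and task index, so the fit transfers to the unseen task; nothing here should be read as evidence about the paper's results.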
