AI · Neutral · arXiv – CS AI · 6h ago · 6/10
Decoupled Behavioral Cloning for Scalable Inductive Generalization in RL from Specifications
Researchers propose DIBS, a decoupled behavioral cloning approach that improves reinforcement learning generalization by separating task-specific policy learning from evolution function learning. The method replaces noisy reward aggregation with stable supervision from teacher policies, achieving better training stability and zero-shot generalization than existing RL and meta-RL algorithms.