AI · Neutral · arXiv – CS AI · 5h ago
IPD: Boosting Sequential Policy with Imaginary Planning Distillation in Offline Reinforcement Learning
Researchers propose Imaginary Planning Distillation (IPD), a novel framework that enhances offline reinforcement learning by incorporating planning into sequential policy models. IPD uses world models and Model Predictive Control to generate optimal rollouts, training Transformer-based policies that significantly outperform existing methods on D4RL benchmarks.
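The summary does not say how IPD implements its planner, so the following is only a minimal sketch of the general idea: a planner queries a world model to produce imagined rollouts that a sequence policy could later be trained on. Everything here is an assumption made for illustration. `world_model` is a hypothetical toy stand-in for a learned dynamics model, random-shooting is just one simple form of Model Predictive Control, and the names `plan_mpc` and `imagined_rollout` do not come from the paper.

```python
import random


def world_model(state, action):
    """Hypothetical stand-in for a learned dynamics model (assumption):
    a 1-D point that is rewarded for staying close to the origin."""
    next_state = state + action
    reward = -abs(next_state)
    return next_state, reward


def plan_mpc(state, horizon=5, n_candidates=64, rng=None):
    """Random-shooting MPC: sample action sequences, score each one inside
    the model, and return the first action of the best-scoring sequence."""
    if rng is None:
        rng = random.Random(0)
    best_return, best_first = float("-inf"), 0.0
    for _ in range(n_candidates):
        actions = [rng.uniform(-1.0, 1.0) for _ in range(horizon)]
        s, total = state, 0.0
        for a in actions:
            s, r = world_model(s, a)
            total += r
        if total > best_return:
            best_return, best_first = total, actions[0]
    return best_first


def imagined_rollout(start_state, steps=10, seed=0):
    """Step the planner forward inside the model only, recording
    (state, action, reward) tuples as an imagined trajectory."""
    rng = random.Random(seed)
    state, trajectory = start_state, []
    for _ in range(steps):
        action = plan_mpc(state, rng=rng)
        next_state, reward = world_model(state, action)
        trajectory.append((state, action, reward))
        state = next_state
    return trajectory


if __name__ == "__main__":
    for state, action, reward in imagined_rollout(3.0):
        print(f"state={state:+.3f} action={action:+.3f} reward={reward:+.3f}")
```

In a distillation setup of the kind the summary describes, trajectories like these would serve as supervised targets for a Transformer-based policy; that training step is omitted here because the summary gives no details about it.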