AI · Bullish · arXiv – CS AI · 14h ago · 7/10
Offline Reinforcement Learning with Generative Trajectory Policies
Researchers propose Generative Trajectory Policies (GTPs), a unified framework for offline reinforcement learning that learns continuous-time generative trajectories. The aim is to close the gap between diffusion policies, which are slow to sample from, and consistency policies, which are fast but tend to perform worse. The approach is reported to achieve state-of-the-art results on the D4RL benchmarks, including perfect scores on difficult AntMaze tasks.