arXiv CS AI · 5d ago
Sample-efficient and Scalable Exploration in Continuous-Time RL
Researchers introduce COMBRL, a reinforcement learning algorithm for continuous-time systems modeled by nonlinear ordinary differential equations. By combining probabilistic models with uncertainty-aware exploration, the algorithm achieves sublinear regret and better sample efficiency than existing methods.