AINeutralarXiv โ CS AI ยท 5d ago4/104
๐ง
Sample-efficient and Scalable Exploration in Continuous-Time RL
Researchers introduce COMBRL, a new reinforcement learning algorithm designed for continuous-time systems using nonlinear ordinary differential equations. The algorithm achieves sublinear regret and better sample efficiency compared to existing methods by combining probabilistic models with uncertainty-aware exploration.