AI · Neutral · arXiv – CS AI · 18h ago · 6/10
On the Complexity of Offline Reinforcement Learning with $Q^\star$-Approximation and Partial Coverage
Researchers present a theoretical framework for offline reinforcement learning and answer a fundamental open question in the negative: $Q^\star$-realizability and Bellman completeness alone are insufficient for sample-efficient learning under partial coverage. The work introduces a decision-estimation framework that improves sample complexity bounds for practical algorithms such as Conservative Q-Learning and extends the theory to previously unexplored settings.