AI · Neutral · arXiv – CS AI · 3h ago · 6/10
SPAR: Support-Preserving Action Rectification
Researchers introduce SPAR (Support-Preserving Action Rectification), an offline reinforcement learning method that targets the tension between maximizing estimated value and keeping actions within the support of the training data. SPAR anchors policy improvement to a frozen behavior-cloning policy and learns its corrections in residual space; the authors report state-of-the-art results on D4RL benchmarks while preserving fidelity to the data distribution.
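
To make the "frozen anchor plus residual" idea concrete, here is a minimal sketch in plain Python. It is not the paper's algorithm: the summary does not say how SPAR bounds or trains its residual, so the tanh-scaled shift, the `max_shift` parameter, and the helper names `make_linear_policy` and `rectified_action` are all hypothetical choices made for illustration.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]
Policy = Callable[[Sequence[float]], Vector]


def make_linear_policy(weights: Sequence[Sequence[float]]) -> Policy:
    """Toy deterministic policy a = W s, standing in for a trained network."""
    def policy(state: Sequence[float]) -> Vector:
        return [sum(w * s for w, s in zip(row, state)) for row in weights]
    return policy


def rectified_action(
    state: Sequence[float],
    bc_policy: Policy,        # frozen behavior-cloning anchor (never updated)
    residual_policy: Policy,  # learned correction (the only trainable part)
    max_shift: float = 0.1,   # assumed cap on how far the action may move
    low: float = -1.0,
    high: float = 1.0,
) -> Vector:
    """Return the anchor action plus a bounded residual, clipped to action limits."""
    base = bc_policy(state)
    raw = residual_policy(state)
    action = []
    for b, r in zip(base, raw):
        # tanh squashes the raw residual into (-1, 1), so |shift| < max_shift.
        shift = max_shift * math.tanh(r)
        action.append(min(high, max(low, b + shift)))
    return action


if __name__ == "__main__":
    bc = make_linear_policy([[0.5, 0.0], [0.0, -0.5]])
    residual = make_linear_policy([[10.0, 0.0], [0.0, 0.0]])
    print(rectified_action([1.0, 1.0], bc, residual))
```

In this sketch a value-maximizing update would only ever touch `residual_policy`, so however large the raw residual grows, the final action stays within `max_shift` of what the behavior-cloning policy would have done. That hard cap is one simple way to keep actions near the data; the actual support-preservation mechanism in SPAR may differ.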