AINeutralarXiv – CS AI · 15h ago6/10
🧠
Efficient On-policy Visual-RL via Stochastic Decoupled Policy Gradient
Researchers introduce SDPG, a visual reinforcement learning method that trains robotic control policies significantly faster and more efficiently on consumer GPUs. The approach reduces computational overhead through stochastic gradient estimation while maintaining superior performance, and includes new benchmarks for advancing visual robotics research.
🏢 Nvidia