PRO-CUA: Process-Reward Optimization for Computer Use Agents
Researchers introduce PRO-CUA, a reinforcement learning framework that improves training of computer use agents (AI systems that automate digital workflows) by using step-level process rewards instead of trajectory-level feedback. The method reduces training costs and distribution shift while achieving better performance on live web benchmarks.
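
To make the contrast between trajectory-level feedback and step-level process rewards concrete, here is a minimal sketch. It is not PRO-CUA's actual training code: the function names, the reward values, and the use of a discounted reward-to-go are all illustrative assumptions, chosen only to show how the two kinds of signal assign credit differently across the steps of one episode.

```python
from typing import List


def trajectory_level_returns(num_steps: int, final_reward: float) -> List[float]:
    """Hypothetical helper: every step gets the same end-of-episode outcome signal."""
    return [final_reward] * num_steps


def step_level_returns(step_rewards: List[float], gamma: float = 1.0) -> List[float]:
    """Hypothetical helper: discounted reward-to-go computed from per-step rewards."""
    returns = [0.0] * len(step_rewards)
    running = 0.0
    # Walk backwards so each step accumulates its own reward plus later ones.
    for t in reversed(range(len(step_rewards))):
        running = step_rewards[t] + gamma * running
        returns[t] = running
    return returns


if __name__ == "__main__":
    # Assumed example: a 3-step episode that fails overall, although the
    # first and last steps were individually judged correct.
    print(trajectory_level_returns(3, 0.0))
    print(step_level_returns([1.0, 0.0, 1.0]))
```

Under trajectory-level feedback, all three steps receive the same signal, so a good step in a failed episode is indistinguishable from a bad one. With per-step rewards, each step's return depends on its own score and on what follows it.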