AI · Bullish · arXiv – CS AI · 9h ago · 7/10
Retrospective Harness Optimization: Improving LLM Agents via Self-Preference over Trajectory Rollouts
Researchers introduce Retrospective Harness Optimization (RHO), a self-supervised method that lets LLM agents improve using only their own historical trajectory data, with no external validation set. In a single optimization round, the approach raised the pass rate on SWE-Bench Pro from 59% to 78%, with results reported across software engineering, technical work, and knowledge domains.