#autonomous-improvement News & Analysis

5 articles tagged with #autonomous-improvement. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Jun 5 · 7/10

Retrospective Harness Optimization: Improving LLM Agents via Self-Preference over Trajectory Rollouts

Researchers introduce Retrospective Harness Optimization (RHO), a self-supervised method that lets AI agents improve using only historical trajectory data, with no external validation set. In a single optimization round, RHO raised the pass rate on SWE-Bench Pro from 59% to 78%, and the authors report gains across software engineering, technical work, and knowledge domains.

AI · Bullish · arXiv – CS AI · May 28 · 7/10

You Live More Than Once: Towards Hierarchical Skill Meta-Evolving

Researchers propose HiSME, a hierarchical skill meta-evolving framework that lets AI agents continuously improve both their skills and the strategies used to evolve those skills at test time, without expensive model parameter updates. The approach learns meta-skills from task execution traces and produces higher-quality skill libraries than static skill-evolution approaches.

AI · Bullish · arXiv – CS AI · May 12 · 7/10

SkillEvolver: Skill Learning as a Meta-Skill

SkillEvolver is a meta-learning framework that automatically improves AI agent skills through iterative refinement based on real-world deployment failures, achieving 56.8% accuracy on benchmark tasks compared to 43.6% for manually curated skills. The system modifies skill prose and code rather than model weights, so the improved skills can be used by any compatible agent without retraining.

AI · Bullish · arXiv – CS AI · May 12 · 7/10

G-Zero: Self-Play for Open-Ended Generation from Zero Data

Researchers introduce G-Zero, a verifier-free framework that enables large language models to improve autonomously through self-play without relying on external judges or proxy models. The approach uses an intrinsic reward mechanism called Hint-δ to identify and address the Generator model's blind spots, achieving scalable self-evolution across unverifiable domains.

AI · Neutral · arXiv – CS AI · Jun 10 · 6/10

Regimes: An Auditable, Held-Out-Gated Improvement Loop Demonstrated on LongMemEval with ActiveGraph

Researchers introduce Regimes, an auditable autonomous improvement loop built on the ActiveGraph event-sourced runtime that enables transparent, reproducible AI agent optimization. The system diagnoses failures, proposes repairs, and validates them through multiple gates before promotion, demonstrating 5-10% held-out accuracy improvements on long-context reading comprehension tasks.