#critic-models News & Analysis

3 articles tagged with #critic-models. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Jun 23 · 7/10

Steer, Don't Solve: Training Small Critic Models for Large Code Agents

Researchers developed a small critic model that guides large code agents during execution rather than evaluating completed work, reducing computational costs while improving performance. The approach achieves 25.2% accuracy on SWE-bench Verified at 64% lower expense than larger agents, demonstrating that supplementing agent training with efficient feedback mechanisms outperforms scaling alone.

🏢 Hugging Face

AI · Bullish · arXiv – CS AI · Mar 5 · 6/10

A Rubric-Supervised Critic from Sparse Real-World Outcomes

Researchers propose Critic Rubrics, a framework intended to bridge the gap between academic coding-agent benchmarks and real-world applications. The system learns from sparse, noisy human interaction data using 24 behavioral features and shows significant improvements on code generation tasks, including a 15.9% gain in reranking performance on SWE-bench.

AI · Neutral · arXiv – CS AI · Apr 15 · 6/10

No More Stale Feedback: Co-Evolving Critics for Open-World Agent Learning

Researchers introduce ECHO, a reinforcement learning framework that co-evolves policy and critic models to address the problem of stale feedback in LLM agent training. The system uses cascaded rollouts and saturation-aware gain shaping to maintain synchronized, relevant critique as the agent's behavior improves over time, demonstrating enhanced stability and success rates in complex environments.