🧠 AI · 🟢 Bullish · Importance 7/10

SAGE: Multi-Agent Self-Evolution for LLM Reasoning

arXiv – CS AI | Yulin Peng, Xinxin Zhu, Chenxing Wei, Nianbo Zeng, Leilei Wang, Ying Tiffany He, F. Richard Yu
🤖 AI Summary

Researchers introduced SAGE, a multi-agent framework in which four specialized agents co-evolve to improve large language model reasoning. The system achieved significant performance gains on coding and mathematics benchmarks without requiring large human-labeled datasets.

Key Takeaways
  • SAGE uses four co-evolving agents (Challenger, Planner, Solver, Critic) to improve LLM reasoning capabilities through self-play.
  • The framework reduces dependency on large human-labeled datasets by using only a small seed set for training.
  • SAGE improved Qwen-2.5-7B model performance by 8.9% on LiveCodeBench and 10.7% on OlympiadBench.
  • The Critic agent prevents curriculum drift and maintains training quality through scoring and filtering mechanisms.
  • The approach shows consistent gains across different model scales in mathematics and code generation tasks.
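The loop described above, in which a Challenger proposes tasks, a Planner decomposes them, a Solver attempts them, and a Critic scores and filters the results, can be sketched in simplified form. This is a hypothetical illustration, not the paper's implementation: the agent functions below are stubs standing in for LLM calls, and the names `self_evolve`, `curriculum`, and the score threshold are assumptions for the sketch.

```python
import random

def challenger(seed_tasks):
    """Propose a new task by perturbing a seed task (stub for an LLM agent)."""
    base = random.choice(seed_tasks)
    return f"{base} (variant {random.randint(0, 999)})"

def planner(task):
    """Decompose the task into intermediate steps (stub)."""
    return [f"step 1 for: {task}", f"step 2 for: {task}"]

def solver(plan):
    """Produce a candidate solution from the plan (stub)."""
    return {"plan": plan, "answer": len(plan)}

def critic(task, solution, threshold=0.5):
    """Score the (task, solution) pair and decide whether to keep it.
    Filtering low-scoring samples is what curbs curriculum drift."""
    score = random.random()  # stands in for a learned quality score
    return score, score >= threshold

def self_evolve(seed_tasks, rounds=10):
    """One self-play pass: only Critic-approved samples join the curriculum."""
    curriculum = []
    for _ in range(rounds):
        task = challenger(seed_tasks)
        plan = planner(task)
        solution = solver(plan)
        score, keep = critic(task, solution)
        if keep:
            curriculum.append((task, solution, score))
    return curriculum

if __name__ == "__main__":
    random.seed(0)
    kept = self_evolve(["prove n^2 >= n for n >= 1"], rounds=20)
    print(f"{len(kept)} of 20 samples kept for training")
```

In the actual framework each stub would be an LLM agent, and the retained samples would be used to further train the Solver, closing the self-evolution loop with only a small seed set of human-written tasks.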