AIBullisharXiv โ CS AI ยท 3d ago7/10
๐ง
From Self-Evolving Synthetic Data to Verifiable-Reward RL: Post-Training Multi-turn Interactive Tool-Using Agents
Researchers developed EigenData, a framework combining self-evolving synthetic data generation with reinforcement learning to train AI agents for multi-turn tool usage and dialogue. The system achieved 73% success on Airline tasks and 98.3% on Telecom benchmarks, matching frontier models while eliminating the need for expensive human annotation.