AI · Bullish · arXiv – CS AI · 6h ago · 7/10
Effective Reinforcement Learning for Agentic Search by Recycling Zero-Variance Queries During Training
Researchers propose a query recycling technique for training large language model search agents. Training queries that are initially non-informative (the zero-variance queries of the title) are reused later in training, once the model has evolved, which improves training efficiency. A 1.7B-parameter model trained with this method achieves performance comparable to much larger 7B-parameter systems, suggesting significant savings in training compute.
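The recycling idea can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that a "zero-variance" query is one whose sampled rollouts all receive the same reward, so it yields no learning signal in group-relative policy-gradient training. The names `split_by_reward_variance` and `RecycleBuffer`, the buffer size, and the oldest-first retry order are all hypothetical choices made for illustration.

```python
from collections import deque


def split_by_reward_variance(rollout_rewards):
    """Split queries into informative and zero-variance groups.

    rollout_rewards maps each query to the list of rewards obtained by
    its sampled rollouts. Assumption: a query is "zero-variance" when
    every rollout received the same reward.
    """
    informative, zero_variance = [], []
    for query, rewards in rollout_rewards.items():
        if len(set(rewards)) > 1:
            informative.append(query)
        else:
            zero_variance.append(query)
    return informative, zero_variance


class RecycleBuffer:
    """Hypothetical buffer that stores zero-variance queries for a later retry."""

    def __init__(self, max_size=1024):
        # When the buffer is full, the oldest entries are dropped.
        self._queue = deque(maxlen=max_size)

    def add(self, queries):
        self._queue.extend(queries)

    def sample(self, n):
        """Remove and return up to n of the oldest stored queries."""
        count = min(n, len(self._queue))
        return [self._queue.popleft() for _ in range(count)]

    def __len__(self):
        return len(self._queue)
```

In a training loop built on this sketch, each step would train on the informative queries, add the zero-variance ones to the buffer, and mix a few buffered queries into a later batch, where the updated model may now produce rollouts with differing rewards.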