AI · Bullish · arXiv – CS AI · 15h ago · 7/10
Search-E1: Self-Distillation Drives Self-Evolution in Search-Augmented Reasoning
Search-E1 introduces a simplified self-evolution method for search-augmented reasoning agents. It combines vanilla GRPO (Group Relative Policy Optimization) with self-distillation and achieves competitive performance without external supervision or complex auxiliary systems. With Qwen2.5-3B, the approach reaches an average exact-match (EM) score of 0.440 on QA benchmarks, suggesting that elaborate post-training machinery may be unnecessary for effective agent development.
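For readers unfamiliar with the two terms in the summary, here is a minimal Python sketch of an exact-match score and of the group-relative advantage that gives GRPO its name. It is an illustration under assumptions, not the paper's implementation: the normalization rules (lowercasing, stripping punctuation and English articles) follow a common QA convention, and the function names `normalize`, `exact_match`, and `group_relative_advantages` are hypothetical. The summary does not state which reward or normalization Search-E1 actually uses.

```python
import re
import string
from statistics import mean, pstdev


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumption: this mirrors a common QA normalization, not necessarily
    the one used by Search-E1.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold_answers: list[str]) -> float:
    """Return 1.0 if the normalized prediction equals any normalized gold answer."""
    pred = normalize(prediction)
    return 1.0 if any(pred == normalize(gold) for gold in gold_answers) else 0.0


def group_relative_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Standardize rewards within one group of rollouts for the same question.

    Each rollout's advantage is its reward minus the group mean, divided by
    the group standard deviation (eps avoids division by zero).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    # Hypothetical rollouts for a single question, scored with EM as the reward.
    gold = ["Eiffel Tower"]
    rollouts = ["The Eiffel Tower.", "Louvre", "Big Ben", "eiffel tower"]
    rewards = [exact_match(r, gold) for r in rollouts]
    print(rewards)
    print(group_relative_advantages(rewards))
```

An average EM figure such as the 0.440 quoted above would be the mean of such per-question scores over a benchmark. Under the group-relative scheme, rollouts that beat their group's mean reward get a positive advantage and the rest get a negative one, so no separate learned value model is needed.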