SlimSearcher: Training Efficiency-Aware Web Agents via Adaptive Reward Gating
Researchers introduce SlimSearcher, a framework that trains AI web agents to perform complex information-seeking tasks with 17–58% fewer tool calls while maintaining or improving accuracy. The approach combines efficient trajectory filtering during supervised fine-tuning with adaptive reward gating during reinforcement learning to eliminate wasteful search behaviors.
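
To make the two ingredients concrete, here is a minimal Python sketch of what trajectory filtering and a gated efficiency reward could look like. It is an illustration under stated assumptions, not SlimSearcher's actual implementation: the summary above does not give the filtering criterion or the reward formula, so the `Trajectory` fields, the "fewest tool calls among correct trajectories" rule, the `budget` parameter, and the `bonus_weight` value are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Trajectory:
    # Hypothetical record of one agent rollout; field names are assumptions.
    question_id: str
    correct: bool
    tool_calls: int


def filter_efficient(trajectories):
    """Keep, per question, the correct trajectory with the fewest tool calls.

    Assumed filtering rule for building a fine-tuning set.
    """
    best = {}
    for t in trajectories:
        if not t.correct:
            continue  # incorrect rollouts are never used as demonstrations
        kept = best.get(t.question_id)
        if kept is None or t.tool_calls < kept.tool_calls:
            best[t.question_id] = t
    return list(best.values())


def gated_reward(correct, tool_calls, budget, bonus_weight=0.2):
    """Accuracy reward plus an efficiency bonus that is gated on correctness.

    Assumed reward shape: the bonus is only paid when the answer is correct,
    so the agent cannot earn reward just by searching less.
    """
    if not correct:
        return 0.0  # gate closed: a short but wrong rollout earns nothing
    if budget <= 0:
        return 1.0  # no meaningful budget, fall back to the accuracy reward
    saved = max(0, budget - tool_calls) / budget
    return 1.0 + bonus_weight * saved
```

In this sketch a correct answer that uses 5 of 10 budgeted calls scores slightly above a correct answer that uses all 10, while any incorrect answer scores zero regardless of length. An adaptive variant might set `budget` from statistics of recent correct rollouts rather than a fixed constant, but that choice is also an assumption here.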