Agent-as-a-Router: Agentic Model Routing for Coding Tasks
Researchers propose Agent-as-a-Router, a framework that dynamically routes each coding task to the most suitable LLM among multiple providers by accumulating execution-grounded experience during deployment. The approach, instantiated as ACRouter, reports a 15.3% relative performance gain over static routers. The work also introduces CodeRouterBench, a benchmark of ~10K task instances evaluated across 8 frontier LLMs, addressing the need for intelligent model selection in multi-provider environments.
The proliferation of specialized large language models has created a practical problem: determining which model to use for which task. Existing routers treat this as a static classification problem, and the researchers identify the core limitation as an information deficit: routers lack sufficient execution data to make optimal decisions. The work instead treats routing as a dynamic process in which feedback from actual task execution continuously improves later routing decisions.
The 15.3% relative performance gain from dimension-level performance statistics shows that even simple augmentation of baseline routers with historical data yields significant improvements. This finding gives a reference point for how much of the performance gap comes from missing information and where agentic approaches can add value. The Context-Action-Feedback (C-A-F) loop, in which feedback from each action updates the context for the next decision, formalizes routing not as a one-time decision but as an ongoing optimization problem, mirroring how human developers choose tools based on accumulated experience.
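The loop described above can be illustrated with a minimal sketch. It is not the ACRouter implementation: the class name `ExperienceRouter`, the smoothing priors, the greedy selection rule, and the model and dimension labels are all assumptions made for illustration. It only shows how per-dimension success statistics could feed back into the next routing decision.

```python
from collections import defaultdict


class ExperienceRouter:
    """Hypothetical router keeping per-(dimension, model) success statistics."""

    def __init__(self, models, prior_successes=1.0, prior_trials=2.0):
        self.models = list(models)
        # Assumed smoothing: unseen (dimension, model) pairs start at a 0.5 rate.
        self.successes = defaultdict(lambda: prior_successes)
        self.trials = defaultdict(lambda: prior_trials)

    def success_rate(self, dimension, model):
        # Context: the accumulated statistics for this task dimension.
        key = (dimension, model)
        return self.successes[key] / self.trials[key]

    def route(self, dimension):
        # Action: greedily pick the model with the best estimated rate.
        # max() returns the first maximal element, so ties go to the
        # model listed earliest.
        return max(self.models, key=lambda m: self.success_rate(dimension, m))

    def record(self, dimension, model, passed):
        # Feedback: fold the execution outcome back into the statistics.
        key = (dimension, model)
        self.trials[key] += 1.0
        if passed:
            self.successes[key] += 1.0


router = ExperienceRouter(["model_a", "model_b"])
router.record("debugging", "model_a", passed=False)
router.record("debugging", "model_b", passed=True)
print(router.route("debugging"))
```

A purely greedy rule like this never revisits a model after early failures; a deployed router would likely need some form of exploration, which this sketch leaves out.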
For practitioners deploying multiple LLMs across different use cases, this framework addresses real operational challenges: cost optimization and performance maximization. Organizations using models from different providers can now leverage execution history to make smarter allocation decisions without manual configuration. The CodeRouterBench benchmark enables standardized comparison of routing strategies, moving the field from anecdotal evidence to empirical measurement.
The generalization to out-of-distribution tasks suggests the framework captures something fundamental about model strengths rather than overfitting to specific task patterns. Future work likely involves extending this to non-coding domains and integrating emerging models into routing decisions without full retraining.
- Agent-as-a-Router achieves 15.3% relative performance improvement over static routers using dimension-level performance statistics.
- The C-A-F loop framework transforms routing from static classification into a dynamic, experience-accumulating optimization process.
- CodeRouterBench provides standardized evaluation across 8 frontier LLMs with ~10K task instances for regret-based router comparison.
- ACRouter successfully generalizes to out-of-distribution tasks, demonstrating the framework closes information gaps through execution-grounded learning.
- Multi-provider LLM deployments can reduce costs and improve performance through intelligent task-to-model routing based on accumulated execution data.
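
As a rough illustration of what regret-based router comparison might look like, the sketch below measures the average gap between the best available model on each task and the model the router actually chose. This definition is an assumption; the exact metric used by CodeRouterBench may differ. The function name `mean_regret`, the score table, and the model labels are hypothetical.

```python
def mean_regret(scores, choices):
    """Average gap between the per-task oracle score and the routed score.

    scores:  {task_id: {model_name: score}} for every candidate model
    choices: {task_id: model_name} as selected by the router under test
    """
    if not scores:
        raise ValueError("scores must contain at least one task")
    total = 0.0
    for task_id, per_model in scores.items():
        # Oracle: the best score any candidate model achieved on this task.
        oracle = max(per_model.values())
        total += oracle - per_model[choices[task_id]]
    return total / len(scores)


# Hypothetical pass/fail scores for two models on three tasks.
scores = {
    "t1": {"model_a": 1.0, "model_b": 0.0},
    "t2": {"model_a": 0.0, "model_b": 1.0},
    "t3": {"model_a": 1.0, "model_b": 1.0},
}
always_a = {task_id: "model_a" for task_id in scores}
print(mean_regret(scores, always_a))
```

Under this definition, a router that always matches the per-task best model has zero regret, so lower values indicate routing decisions closer to the oracle.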