Researchers introduce Continual Model Routing (CMR), a framework addressing the challenge of efficiently selecting from thousands of pre-trained models in expanding AI hubs. They present CMRBench, a large-scale benchmark with over 2,000 candidate models, and CARvE, a contrastive embedding method that outperforms existing routing strategies as model repositories grow.
The rapid expansion of AI model hubs creates a critical infrastructure problem: as repositories scale to thousands of pre-trained models, traditional selection mechanisms break down. This research formalizes the continual model routing problem, recognizing that static routing strategies cannot adapt as new models and tasks continuously enter production systems. The CMRBench benchmark provides the first large-scale evaluation environment simulating realistic hub growth, addressing a gap in existing research that typically assumes fixed model sets.
CARvE combines contrastive learning with checkpoint-based anchoring so that routing stays efficient as the hub changes. Rather than retraining the entire system whenever models are added, it uses structured replay to preserve previously learned routing patterns while incorporating new information. The reported results show substantial improvements over zero-shot retrieval and fine-tuning baselines, which suggests that embedding-based routing is well suited to dynamic model hubs.
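To make the idea of embedding-based routing with replay more concrete, here is a minimal sketch in Python with NumPy. It is not CARvE's implementation. The function names (`info_nce_loss`, `route`), the `ReplayBuffer` class, the temperature value, and the use of reservoir sampling are all assumptions chosen for illustration. Reservoir sampling stands in for the "structured replay" the summary mentions, and checkpoint-based anchoring is not modeled at all.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products become cosine similarities."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def info_nce_loss(query_emb, model_emb, positive_idx, temperature=0.1):
    """Generic InfoNCE contrastive loss (illustrative, not the paper's objective).

    query_emb:    (n_queries, dim) task/query embeddings
    model_emb:    (n_models, dim) candidate model embeddings
    positive_idx: (n_queries,) index of the best model for each query
    """
    q = l2_normalize(query_emb)
    m = l2_normalize(model_emb)
    logits = q @ m.T / temperature                       # (n_queries, n_models)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(q)), positive_idx].mean())


def route(query_emb, model_emb, top_k=1):
    """Return indices of the top_k most similar models for each query."""
    scores = l2_normalize(query_emb) @ l2_normalize(model_emb).T
    return np.argsort(-scores, axis=1)[:, :top_k]


class ReplayBuffer:
    """Bounded memory of past (query, model index) pairs.

    Uses reservoir sampling, a hypothetical stand-in for structured replay.
    """

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []
        self.seen = 0
        self.rng = np.random.default_rng(seed)

    def add(self, query, model_idx):
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append((query, model_idx))
        else:
            # Replace a stored item with probability capacity / seen.
            j = int(self.rng.integers(0, self.seen))
            if j < self.capacity:
                self.items[j] = (query, model_idx)

    def sample(self, n):
        """Draw up to n stored pairs to mix into the next training batch."""
        n = min(n, len(self.items))
        idx = self.rng.choice(len(self.items), size=n, replace=False)
        return [self.items[i] for i in idx]
```

In this sketch, adding a model to the hub amounts to appending one row to `model_emb`; `route` then considers it immediately, with no change to the existing rows. Mixing `ReplayBuffer.sample` output into later training batches is one simple way to keep earlier routing decisions represented while the embeddings are updated for new models.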
For the AI infrastructure industry, this work has practical implications. Organizations deploying mixture-of-experts systems face real costs from suboptimal model selection at scale. More efficient routing of the kind CARvE targets could reduce computational overhead in inference pipelines, and the benchmark gives practitioners a way to evaluate routing strategies before deployment. Looking forward, the research positions continual model routing as a research area in its own right and may spur competing approaches and integration into model hub platforms such as Hugging Face.
- CMRBench provides the first large-scale benchmark with 2,000+ models for evaluating continual routing in expanding AI hubs.
- CARvE's contrastive embedding approach significantly outperforms zero-shot and fine-tuning baselines on dynamic model selection tasks.
- Continual model routing addresses a practical infrastructure challenge as AI model repositories grow exponentially.
- The method uses checkpoint-based anchoring and structured replay to efficiently adapt routing without full retraining.
- This research formalizes model selection as a continual learning problem rather than a static optimization task.