AIBullisharXiv – CS AI · 7h ago7/10
🧠
Lodestar: An Online-Learning LLM Inference Router
Researchers introduce Lodestar, a machine learning-based request routing system that dynamically assigns large language model inference tasks to GPU instances in distributed clusters. The system achieves up to 4.38x improvements in latency metrics compared to existing heuristics by continuously learning optimal routing strategies in real-time.