AI · Bullish · arXiv – CS AI · 18h ago · 6/10
AGENTSERVESIM: A Hardware-aware Simulator for Multi-Turn LLM Agent Serving
Researchers introduce AGENTSERVESIM, a hardware-aware simulator for evaluating serving policies for multi-turn LLM agents without requiring expensive accelerator deployments. The simulator reproduces real-system performance to within 6% error while running on standard CPUs, enabling scalable exploration of agent-serving policies across hardware configurations and workload scenarios.