AI · Bullish · arXiv – CS AI · Apr 20 · 7/10
Towards Understanding, Analyzing, and Optimizing Agentic AI Execution: A CPU-Centric Perspective
Researchers present a CPU-centric analysis of agentic AI systems, identifying bottlenecks in heterogeneous CPU-GPU architectures where most orchestration occurs on the CPU. Two optimization methods, CPU-Aware Overlapped Micro-Batching and Mixed Agentic Scheduling, demonstrate significant latency reductions, addressing a critical infrastructure gap as agentic AI moves toward production deployment.