AIBullish · arXiv – CS AI · 8h ago · 7/10
🧠
Dooly: Configuration-Agnostic, Redundancy-Aware Profiling for LLM Inference Simulation
Dooly is a new profiling framework that reduces the cost of LLM inference simulation by eliminating redundant profiling across different hardware and software configurations. By exploiting structural insights about operator dependencies, it cuts profiling cost by over 56% while keeping simulation error within 5–8%, addressing a key bottleneck in LLM deployment optimization.