AI · Neutral · arXiv – CS AI · 14h ago · 6/10
ConfigSpec: Profiling-Based Configuration Selection for Distributed Edge-Cloud Speculative LLM Serving
ConfigSpec introduces a profiling-based framework for optimizing distributed LLM inference across edge-cloud systems using speculative decoding. The research reveals that no single configuration can simultaneously optimize throughput, cost efficiency, and energy efficiency, requiring dynamic, device-aware configuration selection rather than fixed deployments.
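
To make the "no single best configuration" point concrete, here is a minimal Python sketch of objective-aware selection over profiled configurations. It is not the paper's implementation. The `Profile` fields, the `select` helper, the model names, and every number are hypothetical placeholders chosen for illustration, not measurements from ConfigSpec.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """One profiled speculative-decoding configuration (all fields hypothetical)."""
    draft_model: str     # placeholder name for the edge-side draft model
    draft_len: int       # tokens drafted per verification round
    tokens_per_s: float  # profiled end-to-end throughput
    usd_per_hour: float  # profiled serving cost
    watts: float         # profiled average power draw

    @property
    def tokens_per_usd(self) -> float:
        return self.tokens_per_s * 3600 / self.usd_per_hour

    @property
    def tokens_per_joule(self) -> float:
        return self.tokens_per_s / self.watts


# Made-up profiling table; a real one would come from measuring each device.
PROFILES = [
    Profile("draft-small", 2, tokens_per_s=40.0, usd_per_hour=0.50, watts=10.0),
    Profile("draft-small", 6, tokens_per_s=55.0, usd_per_hour=1.00, watts=20.0),
    Profile("draft-large", 4, tokens_per_s=48.0, usd_per_hour=0.40, watts=16.0),
]

# Each objective maps a profile to the score that should be maximized.
OBJECTIVES = {
    "throughput": lambda p: p.tokens_per_s,
    "cost": lambda p: p.tokens_per_usd,
    "energy": lambda p: p.tokens_per_joule,
}


def select(profiles, objective):
    """Return the profiled configuration with the best score for one objective."""
    return max(profiles, key=OBJECTIVES[objective])


if __name__ == "__main__":
    for name in OBJECTIVES:
        best = select(PROFILES, name)
        print(f"{name}: {best.draft_model}, draft_len={best.draft_len}")
```

With these placeholder numbers, each objective picks a different configuration, which is the situation the summary describes: a fixed deployment has to give up at least one of the three goals.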