AIBullisharXiv โ CS AI ยท 11h ago7/10
๐ง
Watt Counts: Energy-Aware Benchmark for Sustainable LLM Inference on Heterogeneous GPU Architectures
Researchers introduced Watt Counts, an open-access dataset containing over 5,000 energy consumption experiments across 50 LLMs and 10 NVIDIA GPUs, revealing that optimal hardware choices for energy-efficient inference vary significantly by model and deployment scenario. The study demonstrates practitioners can reduce energy consumption by up to 70% in server deployments with minimal performance impact, addressing a critical gap in energy-aware LLM deployment guidance.
๐ข Nvidia