arXiv · CS AI · 14h ago
Characterizing Performance-Energy Trade-offs of Large Language Models in Multi-Request Workflows
Researchers present the first systematic study of performance-energy trade-offs in multi-request LLM inference workflows, measured on NVIDIA A100 GPUs with the vLLM and Parrot serving systems. The study identifies batch size as the most impactful optimization lever, though its effectiveness varies by workload type, and shows that workflow-aware scheduling can reduce energy consumption under power constraints.
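The batch-size lever can be illustrated with a toy cost model (hypothetical parameters, not the paper's measurements): batch latency grows roughly linearly with batch size while GPU power draw stays near a fixed level, so the energy amortized per request falls as batching increases.

```python
def energy_per_request(batch_size, power_w=300.0, base_latency_s=0.05, per_req_s=0.02):
    """Toy model (illustrative numbers only): energy for one batch is
    power * batch latency; dividing by batch size gives the amortized
    per-request energy, which shrinks as the fixed cost is shared."""
    batch_latency_s = base_latency_s + per_req_s * batch_size
    batch_energy_j = power_w * batch_latency_s   # joules for the whole batch
    return batch_energy_j / batch_size           # joules amortized per request

for b in (1, 4, 16, 64):
    print(f"batch={b:3d}  energy/request={energy_per_request(b):6.2f} J")
```

The same model also exposes the trade-off the study characterizes: per-request latency rises with batch size, so the energy savings are not free, and the best operating point depends on the workload.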
Tag: Nvidia