AIBullish · arXiv — CS AI · 4h ago · 7/10
🧠
Patterns behind Chaos: Forecasting Data Movement for Efficient Large-Scale MoE LLM Inference
Researchers analyzed data-movement patterns in large-scale Mixture of Experts (MoE) language models (200B–1000B parameters) to optimize inference performance. Guided by these patterns, they propose architectural modifications that achieve 6.6x speedups on wafer-scale GPUs, along with improved expert-placement algorithms that deliver up to 1.25x gains on existing systems.
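The summary doesn't detail the placement algorithm itself; as a rough illustration of the idea, here is a minimal greedy load-balancing sketch in Python. It assumes per-expert routing counts can be forecast in advance, and the function name and inputs are hypothetical, not taken from the paper:

```python
# Hypothetical sketch of load-balanced expert placement; the paper's
# actual algorithm may differ. Inputs: forecast tokens routed to each
# expert, and the number of GPUs to spread experts across.
import heapq

def place_experts(expert_load: list[float], num_gpus: int) -> list[list[int]]:
    """Greedily assign each expert to the currently least-loaded GPU,
    placing the heaviest experts first (longest-processing-time heuristic),
    so predicted routed-token traffic is balanced across devices."""
    # Min-heap of (current_load, gpu_id) to find the least-loaded GPU fast.
    heap = [(0.0, g) for g in range(num_gpus)]
    heapq.heapify(heap)
    placement: list[list[int]] = [[] for _ in range(num_gpus)]
    for expert in sorted(range(len(expert_load)),
                         key=lambda e: expert_load[e], reverse=True):
        load, gpu = heapq.heappop(heap)  # least-loaded GPU so far
        placement[gpu].append(expert)
        heapq.heappush(heap, (load + expert_load[expert], gpu))
    return placement

# Example: 8 experts with skewed forecast routing counts, 2 GPUs.
print(place_experts([900, 40, 300, 120, 80, 60, 500, 10], 2))
```

Balancing forecast expert load this way reduces the all-to-all token traffic hot spots that skewed routing otherwise creates, which is the kind of data-movement saving the paper targets.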
🟢 Hugging Face