AIBullisharXiv – CS AI · 6h ago7/10
🧠
BitsMoE: Efficient Spectral Energy-Guided Bit Allocation for MoE LLM Quantization
BitsMoE introduces a spectral-energy-guided quantization framework for compressing Mixture-of-Experts large language models, achieving significant improvements in the ultra-low-bit regime. The method uses SVD decomposition to intelligently allocate bits across expert weights, delivering 27.83 percentage point accuracy improvements over existing approaches at 2-bit quantization while accelerating inference speed by 1.76× on Qwen models.