AI · Bullish · arXiv – CS AI · 9h ago · 6/10
Value-and-Structure Alignment for Routing-Consistent Quantization of Mixture-of-Experts Models
Researchers propose VSRAQ, a quantization technique built specifically for Mixture-of-Experts (MoE) models that keeps expert routing stable when the model is compressed. By aligning both values and structure so that expert-selection behavior is preserved, the method is intended to enable efficient deployment of large MoE models without quality degradation.
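
To make "routing instability" concrete, here is a minimal sketch of how one might measure whether quantization changes which experts a token is sent to. It is not VSRAQ and does not reproduce anything from the paper: the names `quantize_symmetric`, `top_k_experts`, and `routing_agreement` are hypothetical, the gate weights and tokens are random, and the quantizer is plain round-to-nearest applied directly to a toy gate matrix, which is a simplifying assumption.

```python
import random

def quantize_symmetric(weights, bits):
    """Round-to-nearest symmetric uniform quantization, one scale per matrix.
    (Generic baseline quantizer used for illustration; not the VSRAQ method.)"""
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for row in weights for w in row)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Quantize then dequantize so the result is comparable to the original.
    return [[round(w / scale) * scale for w in row] for row in weights]

def top_k_experts(gate, token, k):
    """Return the set of k experts with the highest gate logits for one token."""
    logits = [sum(w * x for w, x in zip(row, token)) for row in gate]
    ranked = sorted(range(len(logits)), key=lambda e: logits[e], reverse=True)
    return frozenset(ranked[:k])

def routing_agreement(gate_ref, gate_q, tokens, k):
    """Fraction of tokens whose top-k expert set is identical under both gates."""
    same = sum(
        top_k_experts(gate_ref, t, k) == top_k_experts(gate_q, t, k)
        for t in tokens
    )
    return same / len(tokens)

# Toy setup (assumed sizes): 8 experts, 16-dimensional tokens, top-2 routing.
rng = random.Random(0)
dim, num_experts, k = 16, 8, 2
gate = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_experts)]
tokens = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(500)]

for bits in (8, 4, 2):
    agreement = routing_agreement(gate, quantize_symmetric(gate, bits), tokens, k)
    print(f"{bits}-bit gate: top-{k} routing agreement = {agreement:.3f}")
```

An agreement below 1.0 means some tokens reach different experts after quantization than before, which is the kind of drift a routing-consistent method would try to avoid. In a real MoE model the drift can also come from quantized expert weights changing the activations that later routers see, which this toy example does not model.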