AI · Bullish · arXiv – CS AI · 3h ago · 7/10
Pruning and Distilling Mixture-of-Experts into Dense Language Models
Researchers present a framework for converting Mixture-of-Experts (MoE) language models into standard dense architectures through expert selection, grouping, and knowledge distillation. The method outperforms traditional dense-to-dense pruning and enables deployment on memory-constrained systems.