AI · Bullish · arXiv – CS AI · 14h ago · 6/10
ConMoE: Expert-Pool Consolidation via Prototype Reassignment for MoE Compression
ConMoE is a novel post-training compression method for Mixture-of-Experts (MoE) language models that consolidates the expert pool through prototype reassignment rather than pruning or weight merging. The training-free approach retains a selected subset of pretrained experts as reusable prototypes and remaps each original expert reference to one of these prototypes, so the router is left unchanged and only the prototype weights need to be stored. It achieves competitive or superior performance on major MoE models while significantly reducing deployment memory requirements.
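
The general idea of prototype reassignment can be illustrated with a toy sketch. This is not ConMoE's actual algorithm: the summary does not say how prototypes are selected or how experts are matched to them, so the usage-count selection rule, the cosine-similarity matching, and the names `consolidate`, `route`, `usage`, and `keep` are all assumptions made for illustration. Each "expert" here is just a weight vector.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length weight vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def consolidate(experts, usage, keep):
    """Pick `keep` experts as prototypes and map every expert id to one.

    Assumption (not from the source): prototypes are the most-used experts,
    and each remaining expert is reassigned to its most similar prototype.
    No weights are modified, merged, or retrained.
    """
    ranked = sorted(range(len(experts)), key=lambda i: usage[i], reverse=True)
    prototypes = sorted(ranked[:keep])
    remap = {}
    for i, weights in enumerate(experts):
        if i in prototypes:
            remap[i] = i  # a prototype keeps its own weights
        else:
            remap[i] = max(prototypes, key=lambda p: cosine(weights, experts[p]))
    return prototypes, remap


def route(expert_id, x, experts, remap):
    """Apply the expert the router chose, resolved through the remap table."""
    weights = experts[remap[expert_id]]
    return sum(w * xi for w, xi in zip(weights, x))


# Toy pool of four experts; only two are kept as prototypes.
experts = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
usage = [50, 10, 40, 5]  # hypothetical routing counts
prototypes, remap = consolidate(experts, usage, keep=2)
```

In this sketch the router still emits the original four expert ids, but only the two prototype weight vectors would need to be kept in memory; the remap table redirects the other ids at lookup time.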