AI · Neutral · arXiv – CS AI · 7h ago · 6/10
Redesign Mixture-of-Experts Routers with Manifold Power Iteration
Researchers propose Manifold Power Iteration (MPI), a router redesign for Mixture-of-Experts (MoE) models that aligns each router row with the principal singular direction of its associated expert. The method follows a "Power-then-Retract" paradigm, and the authors report improved MoE effectiveness at scales from 1B to 11B parameters.
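
The summary gives no algorithmic details, so the following is only a minimal sketch of what a "Power-then-Retract" update could look like. It assumes each expert is represented by a single weight matrix `W`, that the power step multiplies a router row by `W^T W`, and that the retraction is a projection back onto the unit sphere. The function name `power_then_retract` and the helpers are hypothetical and are not taken from the paper.

```python
import math


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def transpose(W):
    """Return the transpose of W as a list of rows."""
    return [list(col) for col in zip(*W)]


def retract(x):
    """Assumed retraction: rescale x to unit Euclidean norm.

    Raises ZeroDivisionError if x is the zero vector.
    """
    norm = math.sqrt(sum(v * v for v in x))
    return [v / norm for v in x]


def power_then_retract(router, experts, steps=1):
    """Move each router row toward its expert's top right singular vector.

    router:  list of rows, one per expert (each of length d_model)
    experts: list of weight matrices, each with d_model columns
    steps:   number of power-then-retract iterations per row
    """
    new_router = []
    for row, W in zip(router, experts):
        Wt = transpose(W)
        for _ in range(steps):
            row = matvec(Wt, matvec(W, row))  # power step with W^T W
            row = retract(row)                # retract to the unit sphere
        new_router.append(row)
    return new_router


# Toy usage: one expert whose dominant direction is the first axis.
toy_expert = [[3.0, 0.0], [0.0, 1.0]]
toy_router = [[0.6, 0.8]]
aligned = power_then_retract(toy_router, [toy_expert], steps=20)
```

Under these assumptions, repeated steps drive each row toward the top right singular vector of its expert, which is ordinary power iteration on `W^T W`; how the paper combines this with gradient-based training is not stated in the summary.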