AI · Bullish · arXiv – CS AI · Mar 3 · 7/10
ButterflyMoE: Sub-Linear Ternary Experts via Structured Butterfly Orbits
ButterflyMoE cuts the memory needed to store expert weights in mixture-of-experts models by a reported 150×. Instead of storing an independent weight matrix for each expert, it parameterizes the experts geometrically: all experts share ternary prototypes, and each expert is derived from them through learned rotations. Memory therefore scales sub-linearly with the number of experts, which makes it practical to deploy multiple experts on edge devices.
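
The idea of a shared ternary prototype plus per-expert rotations can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the absmean ternarization rule, the butterfly layout built from Givens rotations, and every name below (`ternarize`, `butterfly_rotate`, `expert_forward`, `param_counts`) are assumptions chosen for illustration, and the dimension `d` is assumed to be a power of two.

```python
import numpy as np

def ternarize(W):
    """Quantize W to {-1, 0, +1} with one shared scale (absmean rule, an assumption)."""
    scale = np.mean(np.abs(W)) + 1e-8
    T = np.clip(np.round(W / scale), -1, 1).astype(np.int8)
    return T, scale

def butterfly_rotate(x, angles):
    """Apply log2(d) stages of pairwise Givens rotations to the last axis of x.

    angles has shape (log2(d), d // 2): one angle per index pair per stage.
    Every stage is orthogonal, so the whole map preserves vector norms.
    """
    d = x.shape[-1]
    y = np.asarray(x, dtype=np.float64).copy()
    idx = np.arange(d)
    for s, theta in enumerate(angles):
        stride = 1 << s
        lo = idx[(idx & stride) == 0]   # lower index of each pair
        hi = lo + stride                # partner index, stride apart
        c, sn = np.cos(theta), np.sin(theta)
        a, b = y[..., lo], y[..., hi]   # fancy indexing returns copies
        y[..., lo] = c * a - sn * b
        y[..., hi] = sn * a + c * b
    return y

def expert_forward(x, T, scale, in_angles, out_angles):
    """Hypothetical expert: rotate input, apply shared ternary prototype, rotate output."""
    h = butterfly_rotate(x, in_angles)
    h = scale * (h @ T.T)
    return butterfly_rotate(h, out_angles)

def param_counts(d, num_experts):
    """Parameter counts: independent dense experts vs. shared prototype plus angles."""
    stages = d.bit_length() - 1                          # log2(d) for power-of-two d
    dense = num_experts * d * d
    shared = d * d + num_experts * 2 * stages * (d // 2)  # input and output rotations
    return dense, shared

# Small demo: four experts reuse one ternary prototype and differ only in their angles.
rng = np.random.default_rng(0)
d, num_experts = 8, 4
T, scale = ternarize(rng.normal(size=(d, d)))
stages = d.bit_length() - 1
x = rng.normal(size=d)
outputs = [
    expert_forward(
        x, T, scale,
        rng.uniform(0, 2 * np.pi, size=(stages, d // 2)),
        rng.uniform(0, 2 * np.pi, size=(stages, d // 2)),
    )
    for _ in range(num_experts)
]
dense, shared = param_counts(d, num_experts)
```

In this sketch each additional expert adds only `d * log2(d)` rotation angles rather than `d * d` weights, and the prototype entries need fewer than two bits each. That is the kind of saving the summary describes, although the counts here are illustrative and are not meant to reproduce the reported 150× figure.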