arXiv – CS AI · 6d ago
ButterflyMoE: Sub-Linear Ternary Experts via Structured Butterfly Orbits
ButterflyMoE reduces the memory required by mixture-of-experts models by 150× by parameterizing experts geometrically instead of storing an independent weight matrix for each one. Each expert is derived from shared ternary prototypes through learned rotations, so memory scales sub-linearly with the number of experts, which makes it practical to deploy multiple experts on edge devices.
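To make the idea of "shared ternary prototypes with learned rotations" concrete, here is a minimal sketch in pure Python. It is not the paper's implementation: the summary does not specify how rotations and prototypes are composed, so the sketch assumes a single shared prototype with entries in {-1, 0, 1} and a per-expert butterfly rotation (log2(d) stages of 2×2 Givens rotations) applied to the input. The names `butterfly_rotate`, `expert_forward`, and `param_counts` are hypothetical.

```python
import math


def butterfly_rotate(x, angles):
    """Rotate vector x (length d, a power of two) with a butterfly network.

    angles[stage][k] is the Givens angle for the k-th pair in that stage,
    so one expert needs only (d / 2) * log2(d) angles in total.
    """
    d = len(x)
    y = list(x)
    stride, stage = 1, 0
    while stride < d:
        k = 0
        for start in range(0, d, 2 * stride):
            for i in range(start, start + stride):
                j = i + stride
                c = math.cos(angles[stage][k])
                s = math.sin(angles[stage][k])
                # 2x2 rotation of the pair (y[i], y[j]); norm-preserving.
                y[i], y[j] = c * y[i] - s * y[j], s * y[i] + c * y[j]
                k += 1
        stride *= 2
        stage += 1
    return y


def expert_forward(prototype, angles, x):
    """Assumed expert: shared ternary prototype applied to the rotated input."""
    rotated = butterfly_rotate(x, angles)
    return [sum(w * v for w, v in zip(row, rotated)) for row in prototype]


def param_counts(d, num_experts):
    """Parameter counts (entries, not bytes) for d x d experts."""
    stages = d.bit_length() - 1  # log2(d) for a power of two
    independent = num_experts * d * d
    shared = d * d + num_experts * (d // 2) * stages
    return independent, shared
```

Under these assumptions, each additional expert adds only (d / 2) · log2(d) angles rather than d² weights. For example, `param_counts(8, 4)` compares four independent 8×8 experts with one shared prototype plus four sets of angles. The counts are parameter counts only and do not reproduce the 150× figure, which depends on details of the method not given here.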