arXiv · CS AI · 14h ago
The Myth of Expert Specialization in MoEs: Why Routing Reflects Geometry, Not Necessarily Domain Expertise
Researchers demonstrate that Mixture-of-Experts (MoE) specialization in large language models emerges from hidden-state geometry rather than from the routing architecture itself, challenging common assumptions about how these systems work. Across models and tasks, expert routing patterns resist human interpretation, suggesting that understanding MoE specialization is as hard as the broader unsolved problem of interpreting LLM internal representations.
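The geometric framing is easiest to see in the router itself: in a standard MoE layer, expert choice is just a top-k over dot products between a token's hidden state and learned router vectors, so where the token sits in hidden-state space fully determines which experts fire. A minimal sketch of that gating step (NumPy; `topk_route`, `W_r`, and all shapes are illustrative assumptions, not the paper's code):

```python
import numpy as np

def topk_route(h, W_r, k=2):
    """Standard top-k gating: expert choice is a purely geometric
    function of the hidden state h -- the k experts whose router
    vectors (rows of W_r) have the largest dot product with h win."""
    logits = W_r @ h                       # (num_experts,) similarity scores
    topk = np.argsort(logits)[-k:][::-1]   # indices of the k best-aligned experts
    gates = np.exp(logits[topk])
    gates /= gates.sum()                   # softmax over the selected experts only
    return topk, gates

rng = np.random.default_rng(0)
num_experts, d_model = 8, 16
W_r = rng.standard_normal((num_experts, d_model))  # hypothetical router weights
h = rng.standard_normal(d_model)                   # one token's hidden state

experts, gates = topk_route(h, W_r)
print(experts, gates)  # which experts fire depends only on h's position in space
```

Nothing in this computation refers to topics or domains; any apparent "domain expertise" can only arise from how hidden states happen to cluster geometrically, which is the crux of the paper's argument.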