AIBullish · arXiv – CS AI · 6h ago
DynaMoE: Dynamic Token-Level Expert Activation with Layer-Wise Adaptive Capacity for Mixture-of-Experts Neural Networks
Researchers introduce DynaMoE, a new Mixture-of-Experts framework that activates experts dynamically based on input complexity and allocates capacity adaptively across network layers. The system achieves better parameter efficiency than baselines with static expert activation, and shows that the optimal expert-scheduling strategy varies with task type and model scale.
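The summary does not spell out DynaMoE's routing rule, but one common way to realize "dynamic token-level expert activation" is a router whose per-token top-k grows with routing uncertainty: confident tokens stop after one expert, ambiguous ones recruit more. The PyTorch sketch below illustrates that idea under those assumptions; the class name `DynamicTopKRouter` and the `max_k` and `threshold` parameters are hypothetical, not from the paper.

```python
# Illustrative sketch only: this assumes a cumulative-probability stopping
# rule for per-token expert counts, not DynaMoE's actual mechanism.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicTopKRouter(nn.Module):
    """Route each token to a variable number of experts.

    Tokens with a diffuse router distribution ("complex" inputs) activate
    more experts; confident tokens activate fewer. `max_k` acts as a
    hypothetical layer-wise capacity budget.
    """
    def __init__(self, d_model: int, n_experts: int, max_k: int = 4,
                 threshold: float = 0.5):
        super().__init__()
        self.gate = nn.Linear(d_model, n_experts)
        self.max_k = max_k
        self.threshold = threshold  # stop once this much routing mass is covered

    def forward(self, x: torch.Tensor):
        # x: (tokens, d_model)
        probs = F.softmax(self.gate(x), dim=-1)        # (tokens, n_experts)
        top_p, top_i = probs.topk(self.max_k, dim=-1)  # sorted descending
        # Keep the smallest prefix of experts whose cumulative probability
        # reaches the threshold; the top-1 expert is always kept.
        cum_before = top_p.cumsum(dim=-1) - top_p
        keep = cum_before < self.threshold
        weights = top_p * keep                          # zero out unused slots
        weights = weights / weights.sum(dim=-1, keepdim=True)
        return top_i, weights, keep

# Tiny smoke test: the number of experts used varies per token.
router = DynamicTopKRouter(d_model=16, n_experts=8, max_k=4)
idx, w, keep = router(torch.randn(5, 16))
print(idx.shape, w.shape, keep.sum(dim=-1))
```

"Layer-wise adaptive capacity" could then be a matter of giving each layer its own `max_k` or `threshold` (e.g. a larger budget in deeper layers); again, that is an assumption about how such a scheme might be wired, not a claim about the paper's design.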