y0news

#sparse-architectures News & Analysis

1 article tagged with #sparse-architectures. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · 6h ago · 7/10

DTop-p MoE: Sparsity-Controlled Dynamic Top-p MoE for Foundation Model Pre-training

Researchers introduce DTop-p, a dynamic routing mechanism for Mixture-of-Experts (MoE) architectures that adaptively selects experts based on token difficulty while maintaining controlled computational costs. The approach outperforms traditional Top-k routing and fixed Top-p methods by using a Proportional-Integral controller to dynamically adjust probability thresholds, demonstrating consistent improvements across large language models and diffusion transformers.
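The summary gives no equations, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes a standard nucleus-style Top-p selection per token (the smallest set of experts whose cumulative routing probability reaches a threshold p) and a textbook Proportional-Integral update that nudges p toward a target average number of active experts. All names (`top_p_select`, `PIController`, `simulate`), the gains `kp` and `ki`, and the random logits are hypothetical illustrations.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_select(probs, p):
    """Pick the smallest set of experts whose cumulative probability reaches p.

    Confident tokens (one dominant expert) activate few experts;
    uncertain tokens activate more. At least one expert is always chosen.
    """
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen, cum = [], 0.0
    for i in order:
        chosen.append(i)
        cum += probs[i]
        if cum >= p:
            break
    return chosen


class PIController:
    """Assumed PI rule: p = p_init + kp * error + ki * sum(errors), clamped."""

    def __init__(self, target, kp=0.02, ki=0.005, p_init=0.5,
                 p_min=0.05, p_max=0.95):
        self.target = target          # desired average experts per token
        self.kp, self.ki = kp, ki     # hypothetical gains, not from the paper
        self.p_init, self.p_min, self.p_max = p_init, p_min, p_max
        self.integral = 0.0
        self.p = p_init

    def update(self, observed):
        # Positive error means too few experts are active, so p should rise.
        error = self.target - observed
        self.integral += error
        raw = self.p_init + self.kp * error + self.ki * self.integral
        self.p = min(self.p_max, max(self.p_min, raw))
        return self.p


def simulate(steps=200, tokens=64, experts=8, target=2.0, seed=0):
    """Route random tokens and let the controller adapt p each step."""
    rng = random.Random(seed)
    ctrl = PIController(target)
    avg = 0.0
    for _ in range(steps):
        counts = []
        for _ in range(tokens):
            # Random logits stand in for a learned router's output.
            logits = [rng.gauss(0.0, 2.0) for _ in range(experts)]
            counts.append(len(top_p_select(softmax(logits), ctrl.p)))
        avg = sum(counts) / tokens
        ctrl.update(avg)
    return ctrl.p, avg


if __name__ == "__main__":
    p, avg = simulate()
    print(f"final threshold p={p:.3f}, experts per token={avg:.2f}")
```

In this sketch the per-token expert count varies with the shape of the routing distribution, while the controller ties the batch-level average to a compute budget. That is the trade-off the summary attributes to DTop-p; the actual controller form and where it is applied may differ in the paper.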