
#sparse-training News & Analysis

2 articles tagged with #sparse-training. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · 7h ago · 7/10

Memory-Efficient LLM Training with Dynamic Sparsity: From Stability to Practical Scaling

Researchers propose Sparse Memory-Efficient Training (SMET), a method that stabilizes Dynamic Sparse Training for large language models by combining optimizer warm-up with density-aware learning-rate scaling. The approach reduces memory consumption while keeping optimization stable, offering a practical alternative to dense model training.

AI · Bullish · arXiv – CS AI · 7h ago · 7/10

HASTE: Hardware-Aware Dynamic Sparse Training for Large Output Spaces

Researchers introduce HASTE, a hardware-aware sparse training method for extreme multi-label classification that uses group-shared fixed fan-in sparsity to optimize GPU execution. The approach achieves up to 25x speedup in backward passes compared to standard sparse methods while maintaining competitive accuracy, addressing the memory-compute bottleneck in models with millions of output labels.
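
For readers unfamiliar with the term, here is a minimal sketch of plain fixed fan-in sparsity, in which every output unit keeps exactly the same number of incoming connections, so indices and weights fit in two rectangular arrays. It is an illustration only, not HASTE's implementation: the function names are hypothetical, the random connectivity is an assumption, and the group-sharing of indices mentioned in the summary is not modeled.

```python
import random

def make_fixed_fan_in_layer(n_in, n_out, fan_in, seed=0):
    """Build a sparse layer where each output unit has exactly `fan_in` inputs.

    Returns two rectangular (n_out x fan_in) lists: input indices and weights.
    Random connectivity is an assumption made for this illustration.
    """
    rng = random.Random(seed)
    indices = [sorted(rng.sample(range(n_in), fan_in)) for _ in range(n_out)]
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(fan_in)]
               for _ in range(n_out)]
    return indices, weights

def forward(x, indices, weights):
    """Gather the selected inputs for each output unit and take a dot product."""
    return [sum(w * x[i] for i, w in zip(idx, ws))
            for idx, ws in zip(indices, weights)]

def to_dense(indices, weights, n_in):
    """Expand to an equivalent dense (n_out x n_in) matrix, for checking only."""
    dense = [[0.0] * n_in for _ in indices]
    for row, (idx, ws) in enumerate(zip(indices, weights)):
        for i, w in zip(idx, ws):
            dense[row][i] = w
    return dense
```

Because every row has the same length, the layer stores `n_out * fan_in` weights instead of `n_out * n_in`, and the regular shape is what makes this kind of layout friendly to batched GPU kernels.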