y0news

#learning-rates News & Analysis

1 article tagged with #learning-rates. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · 15h ago · 6/10

One LR Doesn't Fit All: Heavy-Tail Guided Layerwise Learning Rates for LLMs

Researchers introduce Layerwise Learning Rate (LLR), an adaptive training technique that assigns different learning rates to individual Transformer layers based on Heavy-Tailed Self-Regularization theory. Testing across multiple LLM architectures and scales demonstrates up to 1.5x training speedup and improved generalization, with zero-shot accuracy improvements of 2-3% on billion-parameter models.
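
To make the idea of heavy-tail guided layerwise learning rates concrete, here is a minimal sketch. It is not the paper's algorithm, and the summary gives no formulas. The sketch assumes that each layer is characterized by a power-law exponent (alpha) estimated from the eigenvalues of its weight matrix with a Hill estimator, a common choice in Heavy-Tailed Self-Regularization work. It also assumes that layers with larger alpha receive a larger learning rate through a linear mapping. The function names `hill_alpha` and `layerwise_lrs`, the scale bounds, and the direction of the mapping are all hypothetical choices made for illustration.

```python
import math


def hill_alpha(eigenvalues, k=None):
    """Hill estimate of the power-law exponent of an eigenvalue tail.

    Assumption of this sketch: the tail is the k largest eigenvalues,
    with k defaulting to half of the positive eigenvalues.
    """
    lam = sorted((v for v in eigenvalues if v > 0), reverse=True)
    if len(lam) < 2:
        raise ValueError("need at least two positive eigenvalues")
    if k is None:
        k = len(lam) // 2
    k = max(1, min(k, len(lam) - 1))
    threshold = lam[k]  # the (k+1)-th largest eigenvalue
    log_sum = sum(math.log(v / threshold) for v in lam[:k])
    if log_sum == 0:
        raise ValueError("degenerate tail: top eigenvalues are all equal")
    return 1.0 + k / log_sum


def layerwise_lrs(alphas, base_lr, min_scale=0.5, max_scale=1.5):
    """Map per-layer alpha values to per-layer learning rates.

    Hypothetical rule: scale base_lr linearly between min_scale and
    max_scale, giving the largest alpha the largest learning rate.
    """
    lo, hi = min(alphas.values()), max(alphas.values())
    span = hi - lo
    lrs = {}
    for name, alpha in alphas.items():
        t = 0.5 if span == 0 else (alpha - lo) / span
        lrs[name] = base_lr * (min_scale + t * (max_scale - min_scale))
    return lrs


# Made-up alpha values for three hypothetical layers.
example_alphas = {"layer_0": 2.0, "layer_1": 4.0, "layer_2": 3.0}
example_lrs = layerwise_lrs(example_alphas, base_lr=1e-3)
```

In a PyTorch training loop, a dictionary like `example_lrs` could be turned into optimizer parameter groups, since optimizers such as `torch.optim.AdamW` accept a list of dicts with their own `params` and `lr` entries.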