y0news

#neural-scaling-laws News & Analysis

5 articles tagged with #neural-scaling-laws. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · May 27 · 7/10

Unified Neural Scaling Laws

Researchers have developed a Unified Neural Scaling Law (UNSL) that accurately models how deep neural networks perform as multiple training and architectural dimensions vary simultaneously. This functional form outperforms existing scaling models across vision, language, math, and reinforcement learning tasks, enabling more precise extrapolation of neural network behavior at scale.

AI · Neutral · arXiv – CS AI · Jun 5 · 6/10

Scaling Laws and Spectra of Shallow Neural Networks in the Feature Learning Regime

Researchers present a theoretical framework analyzing scaling laws for shallow neural networks in the feature learning regime, deriving phase diagrams that connect sample complexity and weight decay to risk exponents. The work bridges empirical observations in deep learning with rigorous mathematical analysis, establishing links between weight spectrum properties and generalization performance through matrix compressed sensing and LASSO theory.

AI · Neutral · arXiv – CS AI · Jun 2 · 6/10

Inverse Depth Scaling From Most Layers Being Similar

Researchers analyzing large language models find that loss scales inversely with network depth, suggesting most layers function similarly and reduce error through ensemble averaging rather than compositional learning. This inefficient scaling pattern may stem from architectural constraints in residual networks, indicating that improving LLM efficiency requires fundamental architectural innovations rather than simply adding more layers.
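As a rough illustration of why ensemble averaging would give inverse scaling (a textbook variance argument, not the paper's own derivation): assume each of $L$ layers contributes an estimate with independent, zero-mean error $\epsilon_i$ of variance $\sigma^2$. The symbols $L$, $\epsilon_i$, and $\sigma^2$ are introduced here for illustration only.

```latex
\[
\operatorname{Var}\!\left(\frac{1}{L}\sum_{i=1}^{L}\epsilon_i\right)
  = \frac{1}{L^{2}}\sum_{i=1}^{L}\operatorname{Var}(\epsilon_i)
  = \frac{\sigma^{2}}{L}
\]
```

Under these assumptions the squared error falls only as $1/L$, so doubling the depth halves it at best.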

AI · Neutral · arXiv – CS AI · May 29 · 6/10

On the Optimizer Dependence of Neural Scaling Laws

Researchers demonstrate that the scaling exponent in neural scaling laws varies systematically based on optimizer choice, with preconditioned optimizers achieving 2.6x larger exponents than standard gradient descent in controlled experiments. The findings suggest scaling-law forecasts must account for optimizer selection, though the practical impact on large-scale LLM training remains uncertain.
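To make the notion of a scaling exponent concrete, here is a minimal sketch of how one might estimate it from loss measurements with a log-log linear fit. The data is synthetic and noise-free, the function name `fit_power_law_exponent` is hypothetical, and the exponent values 0.10 and 0.26 are assumptions chosen only so that their ratio matches the 2.6x figure quoted above; none of this comes from the paper.

```python
import numpy as np


def fit_power_law_exponent(sizes, losses):
    """Fit loss ~ a * size**(-alpha) by least squares in log-log space.

    Returns (alpha, a). Hypothetical helper for illustration only.
    """
    slope, intercept = np.polyfit(np.log(sizes), np.log(losses), 1)
    return -slope, float(np.exp(intercept))


# Synthetic model/data sizes (assumed values, not from the paper).
sizes = np.array([1e3, 1e4, 1e5, 1e6])

# Assumed exponents, picked so that alpha_precond / alpha_gd == 2.6.
alpha_gd = 0.10
alpha_precond = 0.26

# Noise-free synthetic losses following an exact power law.
loss_gd = 5.0 * sizes ** (-alpha_gd)
loss_precond = 5.0 * sizes ** (-alpha_precond)

fit_gd, _ = fit_power_law_exponent(sizes, loss_gd)
fit_precond, _ = fit_power_law_exponent(sizes, loss_precond)

print(f"fitted exponent (gradient descent): {fit_gd:.3f}")
print(f"fitted exponent (preconditioned):   {fit_precond:.3f}")
print(f"exponent ratio:                     {fit_precond / fit_gd:.2f}")
```

A forecast extrapolated with the wrong exponent diverges quickly from the true curve as size grows, which is why the choice of optimizer matters for scaling-law predictions.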

AI · Neutral · arXiv – CS AI · May 4 · 6/10

The Quantization Trap: Breaking Linear Scaling Laws in Multi-Hop Reasoning

Researchers demonstrate that quantization (reducing AI model precision to improve efficiency) paradoxically increases energy consumption and degrades accuracy in multi-hop reasoning tasks, contradicting established neural scaling laws. The study identifies hardware dequantization overhead as a critical bottleneck and proposes a Critical Model Scale metric to predict when quantization becomes counterproductive across different model sizes and hardware configurations.