y0news

#scaling-laws News & Analysis

37 articles tagged with #scaling-laws. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Neutral · arXiv – CS AI · May 12 · 6/10

A Geometric Perspective on Next-Token Prediction in Large Language Models: Three Emerging Phases

Researchers have developed a geometric framework for understanding how large language models process information across their layers, identifying three distinct phases in next-token prediction: Seeding Multiplexing, Hoisting Overriding, and Focal Convergence. The study reveals that model depth primarily increases capacity for candidate disambiguation rather than adding fundamentally new computational stages.

AI · Bullish · arXiv – CS AI · May 11 · 6/10

Knowledge Transfer Scaling Laws for 3D Medical Imaging

Researchers demonstrate that different 3D medical imaging domains (CT, MRI, PET) transfer knowledge asymmetrically during pretraining, following predictable power-law patterns. By optimizing data allocation based on these transfer dynamics, they achieve up to 58% performance gains over proportional sampling, revealing a hub-and-island structure where certain domains act as foundational knowledge sources for others.

AI · Neutral · arXiv – CS AI · May 11 · 6/10

Spectral Dynamics in Deep Networks: Feature Learning, Outlier Escape, and Learning Rate Transfer

Researchers develop a dynamical mean-field theory framework to analyze how neural network weight spectra evolve during training, revealing that different parameterization schemes (μP vs NTK) produce fundamentally different outlier dynamics. The findings suggest that neural scaling laws and hyperparameter transfer depend critically on how outlier eigenvalues behave, with implications for understanding deep learning generalization and optimization.

AI · Neutral · arXiv – CS AI · May 9 · 6/10

Causal Probing for Internal Visual Representations in Multimodal Large Language Models

Researchers developed a causal probing framework to decode how Multimodal Large Language Models internally represent visual concepts, revealing that entities are encoded in localized regions while abstract concepts distribute globally across networks. The findings expose mechanistic drivers of scaling laws and uncover a disconnect between visual perception and reasoning capabilities in MLLMs.

AI · Neutral · arXiv – CS AI · May 9 · 6/10

Continuous Latent Diffusion Language Model

Researchers propose Cola DLM, a hierarchical latent diffusion language model that generates text through continuous semantic modeling rather than traditional left-to-right autoregressive decoding. The approach achieves comparable performance to autoregressive models while offering greater flexibility, better scaling properties, and a potential pathway for unified modeling across discrete and continuous modalities.

AI · Bearish · Crypto Briefing · Apr 11 · 6/10

Ranjan Roy: AI marketing hype often overshadows substance, concerns about AI exploiting software vulnerabilities, and the significance of scaling laws in model performance | Big Technology

Ranjan Roy highlights how AI marketing hype often obscures substantive security concerns, particularly regarding AI systems exploiting software vulnerabilities. The analysis emphasizes the importance of scaling laws in model performance and urges critical evaluation of AI breakthroughs beyond promotional claims.

AI · Neutral · arXiv – CS AI · Mar 3 · 7/10 · 7

What Is the Geometry of the Alignment Tax?

Researchers present a formal geometric theory for quantifying the alignment tax, the tradeoff between AI safety and capability performance. They show that safety-capability conflicts can be measured using the angles between representation subspaces, and they provide scaling laws for how these tradeoffs evolve with model size.

AI · Neutral · arXiv – CS AI · Feb 27 · 5/10 · 5

Scaling Laws for Precision in High-Dimensional Linear Regression

Researchers developed theoretical scaling laws for low-precision AI model training, analyzing how quantization affects model performance in high-dimensional linear regression. The study reveals that multiplicative and additive quantization schemes have distinct effects on effective model size: multiplicative quantization preserves the full-precision effective model size, while additive quantization reduces it.

AI · Neutral · Google Research Blog · Jan 27 · 6/10 · 5

ATLAS: Practical scaling laws for multilingual models

ATLAS presents new scaling laws for multilingual generative AI models, providing practical frameworks for understanding how model performance scales across different languages and model sizes. This research offers valuable insights for optimizing multilingual AI system development and deployment strategies.

AI · Neutral · arXiv – CS AI · Mar 3 · 4/10 · 4

Scaling Laws of SignSGD in Linear Regression: When Does It Outperform SGD?

Researchers analyzed scaling laws for signSGD optimization in machine learning, comparing it to standard SGD under a power-law random features model. The study identifies unique effects in signSGD that can lead to steeper compute-optimal scaling laws than SGD in noise-dominant regimes.

AI · Neutral · OpenAI News · Oct 19 · 1/10 · 7

Scaling laws for reward model overoptimization

The article appears to discuss scaling laws related to reward model overoptimization in AI systems. However, the article body is empty, making it impossible to provide meaningful analysis of the content or implications.

AI · Neutral · OpenAI News · Jan 23 · 1/10 · 7

Scaling laws for neural language models

The article title references scaling laws for neural language models, which are fundamental principles governing how AI model performance improves with increased computational resources, data, and model size. However, no article body content was provided for analysis.
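As a rough illustration of what a scaling law looks like in practice, the sketch below fits a power law of the form loss = a * size^(-alpha) by least squares in log-log space. The functional form, the function name `fit_power_law`, and the data points are all assumptions made for illustration; none of them come from the article, whose body was not available.

```python
import math


def fit_power_law(sizes, losses):
    """Fit loss = a * size**(-alpha) via linear least squares on log-log data.

    Hypothetical helper for illustration; not taken from any cited article.
    Returns the tuple (a, alpha).
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(v) for v in losses]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Ordinary least-squares slope and intercept of log(loss) against log(size).
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


# Synthetic data generated from an assumed law (a = 5.0, alpha = 0.1).
sizes = [1e6, 1e7, 1e8, 1e9]
losses = [5.0 * n ** -0.1 for n in sizes]

a, alpha = fit_power_law(sizes, losses)
print(f"a = {a:.3f}, alpha = {alpha:.3f}")
```

Because the synthetic points lie exactly on a power law, the fit recovers the generating parameters up to floating-point error. Real measurements are noisy and often need an additional irreducible-loss term, which this sketch omits.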
