#learning-theory News & Analysis

5 articles tagged with #learning-theory. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Jun 11 · 7/10

Unifying Learning Dynamics and Generalization in Transformers Scaling Law

Researchers formalize the theoretical foundations of LLM scaling laws by modeling transformer learning dynamics as differential equations, establishing matching upper and lower bounds that characterize a two-phase convergence pattern: exponential decay during optimization followed by power-law decay during the statistical phase. This work bridges the gap between empirical observations and rigorous mathematical theory, providing independent scaling relationships for model size, training time, and dataset size.
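The two-phase pattern described above can be illustrated with a toy curve. This is a minimal sketch, not the paper's bound: the function names and every constant (`a`, `rate`, `b`, `alpha`) are hypothetical, chosen only to show how an exponentially decaying term can dominate early while a power-law term dominates later.

```python
import math


def toy_loss_terms(t, a=10.0, rate=0.5, b=1.0, alpha=0.5):
    """Return (optimization_term, statistical_term) at training time t.

    Hypothetical model: an exponentially decaying term plus a power-law
    term. All constants are illustrative, not taken from the paper.
    """
    optimization_term = a * math.exp(-rate * t)
    statistical_term = b * (1.0 + t) ** (-alpha)
    return optimization_term, statistical_term


def dominant_phase(t, **constants):
    """Name the phase whose term contributes more to the toy loss at time t."""
    optimization_term, statistical_term = toy_loss_terms(t, **constants)
    return "optimization" if optimization_term > statistical_term else "statistical"


if __name__ == "__main__":
    for t in (0, 1, 5, 10, 50):
        opt, stat = toy_loss_terms(t)
        print(f"t={t:>3}  total={opt + stat:.4f}  phase={dominant_phase(t)}")
```

With these illustrative constants the exponential term is larger at the start of training and becomes negligible later, leaving the slower power-law term to set the loss.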

AI · Neutral · arXiv – CS AI · Jun 2 · 6/10

Robust Shielding for Safe Reinforcement Learning

Researchers introduce a novel shielding framework for reinforcement learning agents that guarantees safety without requiring prior knowledge of system dynamics. By combining robust MDPs with linear temporal logic specifications and PAC learning guarantees, the approach enables the creation of minimally restrictive safety shields for unknown environments while maintaining strong performance as data accumulates.
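The general idea of a data-driven shield can be sketched in a few lines. This is a much simpler stand-in than the framework summarized above: it swaps robust MDPs and temporal logic for a plain Hoeffding confidence bound on per-action violation counts, and the function names, `threshold`, and `delta` are hypothetical.

```python
import math


def unsafe_upper_bound(unsafe_count, total_count, delta=0.05):
    """One-sided Hoeffding upper bound on an action's unsafe probability.

    Holds with probability at least 1 - delta, assuming i.i.d. observations.
    """
    if total_count == 0:
        return 1.0  # no data yet: assume the worst case
    empirical = unsafe_count / total_count
    slack = math.sqrt(math.log(1.0 / delta) / (2.0 * total_count))
    return min(1.0, empirical + slack)


def shielded_actions(stats, threshold=0.2, delta=0.05):
    """Keep only actions whose pessimistic unsafe probability is within threshold.

    stats maps an action name to (unsafe_count, total_count).
    """
    return [
        action
        for action, (unsafe, total) in stats.items()
        if unsafe_upper_bound(unsafe, total, delta) <= threshold
    ]


if __name__ == "__main__":
    # Hypothetical counts for three actions in one state.
    stats = {"left": (0, 10), "right": (0, 1000), "stay": (300, 1000)}
    print(shielded_actions(stats))
```

In this sketch an action with few observations is blocked even if it has never failed, and it becomes permitted once enough safe observations shrink the confidence slack, which mirrors the summary's point about performance improving as data accumulates.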

AI · Neutral · arXiv – CS AI · May 29 · 5/10

On Language Generation in the Limit with Bounded Memory

This theoretical computer science paper investigates language generation under bounded memory constraints, extending classical learning theory to a practical setting where algorithms cannot retain complete historical information. The research characterizes when language generation remains possible under various memory limitations and shows that bounded memory affects three learning tasks (generation, density optimization, and identification) in fundamentally different ways.

AI · Neutral · arXiv – CS AI · May 27 · 6/10

A Sharper Picture of Generalization in Transformers

Researchers present a new theoretical framework for understanding how transformers generalize on boolean functions using PAC-Bayes theory and Fourier spectral analysis. The work provides non-vacuous generalization bounds for transformers and offers formal explanations for why chain-of-thought reasoning improves performance on complex tasks.

AI · Neutral · arXiv – CS AI · May 11 · 6/10

Spectral Filtering for Complex Linear Dynamical Systems

Researchers introduce a spectral filtering method for learning complex-valued linear dynamical systems with sector-bounded spectrum, achieving dimension-free regret bounds for sequence prediction. The approach uses Slepian basis functions and demonstrates that learning efficiency depends on an effective dimension independent of state space size, with applications to signal processing and quantum systems.