y0news

#theoretical-ai News & Analysis

6 articles tagged with #theoretical-ai. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

🧠 AI · Bullish · arXiv – CS AI · Mar 4 · 6/103

On the Expressive Power of Transformers for Maxout Networks and Continuous Piecewise Linear Functions

Researchers establish theoretical foundations for the expressive power of Transformer networks by connecting them to maxout networks and continuous piecewise linear functions. The study proves that Transformers inherit the universal approximation capabilities of ReLU networks, while revealing that self-attention layers implement max-type operations and feedforward layers perform token-wise affine transformations.
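
As background on the connection, a maxout unit computes the maximum over several affine maps, and ReLU is its two-piece special case, which is why maxout networks subsume ReLU networks as approximators. A minimal NumPy sketch of this standard definition (illustrative, not code from the paper):

```python
import numpy as np

def maxout(x, W, b):
    """Maxout unit: the max over k affine pieces w_i^T x + b_i.
    W has shape (k, d), b has shape (k,); the output is a continuous
    piecewise linear (CPWL) function of x."""
    return np.max(W @ x + b)

# ReLU is the two-piece special case max(x, 0):
x = np.array([-1.5])
W = np.array([[1.0], [0.0]])   # pieces: the identity map and the zero map
b = np.array([0.0, 0.0])
assert maxout(x, W, b) == max(x[0], 0.0)   # ReLU(-1.5) == 0.0
```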

🧠 AI · Neutral · arXiv – CS AI · Mar 4 · 7/103

Bridging Kolmogorov Complexity and Deep Learning: Asymptotically Optimal Description Length Objectives for Transformers

Researchers introduce a theoretical framework connecting Kolmogorov complexity to Transformer neural networks through asymptotically optimal description-length objectives. The work demonstrates the computational universality of Transformers and proposes a variational objective that achieves optimal compression, though current optimization methods struggle to find such solutions from random initialization.
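
For orientation, description-length objectives are typically variants of the classic two-part MDL code, with Kolmogorov complexity as the idealized, uncomputable limit; a generic form (the paper's exact objective is not quoted here):

```latex
% Two-part description length of data D under a model class \mathcal{M}:
% L(M) = bits to encode the model, L(D \mid M) = bits to encode D given M.
L(D) \;=\; \min_{M \in \mathcal{M}} \bigl[\, L(M) + L(D \mid M) \,\bigr]
% Kolmogorov complexity K(D) lower-bounds any such computable scheme up to
% an additive constant; variational objectives bound the description length
% from above, so they can be minimized by gradient methods.
```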

🧠 AI · Neutral · arXiv – CS AI · Feb 27 · 7/106

On the Complexity of Neural Computation in Superposition

Researchers establish theoretical foundations for neural-network superposition, proving lower bounds of Ω(√m′ log m′) neurons and Ω(m′ log m′) parameters for computing m′ features. The work demonstrates an exponential complexity gap between computing features and merely representing them, and provides the first subexponential bounds on network capacity.
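
To get a feel for the scaling (the Ω(·) bounds hide constant factors, so the figures below are illustrative only), a quick numeric sketch:

```python
import math

# Scaling of the lower bounds: computing m' features in superposition needs
# on the order of sqrt(m') * log(m') neurons and m' * log(m') parameters.
for m in (10**3, 10**6, 10**9):
    neurons = math.sqrt(m) * math.log2(m)
    params = m * math.log2(m)
    print(f"m' = {m:,}:  neurons ~ {neurons:,.0f},  params ~ {params:,.0f}")
```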

🧠 AI · Neutral · arXiv – CS AI · Feb 27 · 7/105

On the Equivalence of Random Network Distillation, Deep Ensembles, and Bayesian Inference

Researchers establish theoretical connections between Random Network Distillation (RND), deep ensembles, and Bayesian inference for uncertainty quantification in deep learning models. The study proves that RND's uncertainty signals are equivalent to deep ensemble predictive variance and can mirror Bayesian posterior distributions, providing a unified theoretical framework for efficient uncertainty quantification methods.
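
Random Network Distillation itself is easy to state: a predictor network is trained to match a fixed, randomly initialized target network, and its residual error is the uncertainty signal. A minimal PyTorch sketch of this standard setup (architecture and hyperparameters are arbitrary choices, not the paper's):

```python
import torch
import torch.nn as nn

def mlp(d_in, d_out):
    return nn.Sequential(nn.Linear(d_in, 64), nn.ReLU(), nn.Linear(64, d_out))

target = mlp(8, 16)                       # fixed, randomly initialized network
for p in target.parameters():
    p.requires_grad_(False)               # the target is never trained
predictor = mlp(8, 16)                    # trained to imitate the target
opt = torch.optim.Adam(predictor.parameters(), lr=1e-3)

def rnd_uncertainty(x):
    # RND's signal: squared prediction error against the frozen target.
    return ((predictor(x) - target(x)) ** 2).mean(dim=-1)

# Training on observed inputs drives the error toward zero there, while it
# stays large on unfamiliar inputs; this is the signal the paper relates to
# deep-ensemble predictive variance.
x = torch.randn(32, 8)
loss = rnd_uncertainty(x).mean()
opt.zero_grad(); loss.backward(); opt.step()
```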

🧠 AI · Neutral · arXiv – CS AI · Mar 3 · 7/109

Universal NP-Hardness of Clustering under General Utilities

Researchers prove that clustering problems in machine learning are universally NP-hard, providing a theoretical explanation for why clustering algorithms often produce unstable results. The study demonstrates that major clustering methods such as k-means and spectral clustering inherit this fundamental computational intractability, explaining common failure modes such as convergence to local optima.
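
For reference, this is the k-means objective that heuristics such as Lloyd's algorithm can only locally optimize; its exact minimization is already known to be NP-hard, a background fact consistent with the paper's broader result:

```latex
% k-means: choose centers mu_1, ..., mu_k minimizing within-cluster squared
% distances. Lloyd's algorithm alternates assignment and re-centering and is
% only guaranteed to reach a local optimum of this objective.
\min_{\mu_1, \dots, \mu_k} \; \sum_{i=1}^{n} \; \min_{j \in \{1, \dots, k\}} \lVert x_i - \mu_j \rVert^2
```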

🧠 AI · Neutral · arXiv – CS AI · Feb 27 · 4/107

From Shallow Bayesian Neural Networks to Gaussian Processes: General Convergence, Identifiability and Scalable Inference

Researchers establish a new theoretical framework connecting Bayesian neural networks to Gaussian processes, developing improved convergence results and identifiability properties. They introduce a scalable computational method using the Nyström approximation for training and prediction, demonstrating competitive performance on real-world datasets.
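
The Nyström method mentioned here is a standard low-rank kernel approximation built from m ≪ n landmark points. A minimal NumPy sketch of the generic construction (illustrative; the paper's exact estimator is not reproduced):

```python
import numpy as np

def rbf(A, B, ell=1.0):
    """RBF kernel matrix between the rows of A and the rows of B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq / ell**2)

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))                   # n = 500 training inputs
Z = X[rng.choice(500, size=50, replace=False)]  # m = 50 landmark points

# Nyström: K ~ K_nm @ K_mm^{-1} @ K_mn, reducing the O(n^3) GP cost to
# O(n m^2) by routing all computation through the m landmarks.
K_nm = rbf(X, Z)
K_mm = rbf(Z, Z) + 1e-8 * np.eye(50)            # jitter for numerical stability
K_approx = K_nm @ np.linalg.solve(K_mm, K_nm.T)
```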