#mathematical-theory News & Analysis

4 articles tagged with #mathematical-theory. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Neutral · arXiv – CS AI · Apr 14 · 7/10

A Mathematical Explanation of Transformers

Researchers propose a mathematical framework that interprets Transformers as discretized integro-differential equations. In this view, self-attention is a non-local integral operator and layer normalization is a time-dependent projection. The framework connects deep learning architectures to continuous mathematical modeling and offers new insights for architecture design and interpretability.
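As a rough illustration of the integral-operator reading (a minimal sketch, not the paper's formulation), softmax attention can be written as a normalized kernel sum over positions, which is the discrete counterpart of dividing an integral of kernel times value by the integral of the kernel. The function name `attention_as_kernel_sum` and the scaling by the square root of the dimension are assumptions made for this example.

```python
import math


def attention_as_kernel_sum(queries, keys, values):
    """Softmax attention written as a normalized kernel sum (hypothetical helper).

    Each output is sum_j K(q, k_j) * v_j / sum_j K(q, k_j),
    with K(q, k) = exp(q . k / sqrt(d)).
    """
    outputs = []
    for q in queries:
        scale = math.sqrt(len(q))
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        # Subtract the max score for numerical stability; the ratio is unchanged.
        top = max(scores)
        kernel = [math.exp(s - top) for s in scores]
        total = sum(kernel)
        dim = len(values[0])
        outputs.append(
            [sum(w * v[d] for w, v in zip(kernel, values)) / total for d in range(dim)]
        )
    return outputs
```

When every key is identical the kernel is constant, so each output reduces to the plain average of the values, as a uniform-weight integral would.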

AI · Neutral · arXiv – CS AI · Jun 10 · 6/10

Minimum Distortion Quantization with Specified Output Distribution

Researchers have developed a mathematical framework for optimal quantization that constrains output distributions while minimizing mean squared error. This theoretical advance has practical applications in entropy control, mutual information maximization, communication systems, and privacy-preserving data anonymization.
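One simple way to picture the constraint, assuming a scalar source and assuming that only the probability of each output level is specified (the paper's setting may be more general): sort the samples, cut them into contiguous cells whose sizes match the target probabilities, and use each cell's mean as its output level, since the mean minimizes squared error within a cell. The function name `quantize_with_output_probs` is hypothetical.

```python
def quantize_with_output_probs(samples, probs):
    """Empirical scalar quantizer with prescribed cell probabilities (a sketch).

    Returns one reproduction level per cell, in increasing order.
    """
    xs = sorted(samples)
    n = len(xs)
    levels = []
    start = 0
    cumulative = 0.0
    for i, p in enumerate(probs):
        cumulative += p
        # The last cell takes everything left, so rounding never drops samples.
        end = n if i == len(probs) - 1 else round(cumulative * n)
        cell = xs[start:end]
        if not cell:
            raise ValueError("too few samples for the requested probabilities")
        # The cell mean is the squared-error-optimal level for that cell.
        levels.append(sum(cell) / len(cell))
        start = end
    return levels
```

For example, four samples 1, 2, 3, 4 with target probabilities 0.5 and 0.5 are split into the cells {1, 2} and {3, 4}, giving output levels 1.5 and 3.5.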

AI · Neutral · arXiv – CS AI · Jun 5 · 6/10

Gradient descent at the Edge of Stability: free energy model and kinetic description of the two-layer network

Researchers propose a continuous-time mathematical model for analyzing gradient descent dynamics in the Edge of Stability regime, where large learning rates cause oscillations in neural network training. The model introduces an effective free energy that combines the risk with a curvature-related term, enabling better prediction of training dynamics in wide two-layer networks. It is validated on matrix factorization and CIFAR-10 tasks.
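The oscillations mentioned above can be seen in the simplest possible setting, a one-dimensional quadratic loss, where gradient descent is stable only if the learning rate is below 2 divided by the curvature. The sketch below is a toy illustration of that classical threshold, not the free energy model from the paper; the function name `gd_on_quadratic` and its parameters are hypothetical.

```python
def gd_on_quadratic(sharpness, lr, x0=1.0, steps=20):
    """Run gradient descent on f(x) = 0.5 * sharpness * x**2 (toy example).

    Each step multiplies x by (1 - lr * sharpness), so the iterates
    shrink when lr < 2 / sharpness and grow when lr > 2 / sharpness.
    """
    x = x0
    trajectory = [x]
    for _ in range(steps):
        x = x - lr * sharpness * x  # gradient of the quadratic is sharpness * x
        trajectory.append(x)
    return trajectory


below = gd_on_quadratic(sharpness=1.0, lr=1.9)  # oscillates in sign and decays
above = gd_on_quadratic(sharpness=1.0, lr=2.1)  # oscillates in sign and grows
```

In both runs the iterate flips sign at every step; only the run with the learning rate below the threshold converges.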

AI · Neutral · arXiv – CS AI · Apr 14 · 6/10

Detecting Invariant Manifolds in ReLU-Based RNNs

Researchers have developed a novel algorithm for detecting invariant manifolds in ReLU-based recurrent neural networks (RNNs), enabling analysis of dynamical system behavior through topological and geometrical properties. The method identifies basin boundaries, multistability, and chaotic dynamics, with applications to scientific computing and explainable AI.
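A one-dimensional toy case hints at why ReLU networks are amenable to this kind of analysis: the update is affine within each activation region, so fixed points can be solved for region by region, and an unstable fixed point acts as a basin boundary. The sketch below assumes a scalar map of the form z -> a*z + w*relu(z) + h, chosen only for illustration; it is not the paper's algorithm, and the names `step` and `fixed_points` are hypothetical.

```python
def relu(z):
    return max(z, 0.0)


def step(z, a, w, h):
    """One update of the assumed scalar ReLU map."""
    return a * z + w * relu(z) + h


def fixed_points(a, w, h):
    """List (z_star, is_stable) for each linear region of the scalar map."""
    points = []
    for active in (0, 1):  # 0: ReLU off (z <= 0), 1: ReLU on (z > 0)
        slope = a + w * active
        if slope == 1.0:
            continue  # no isolated fixed point in this region
        z_star = h / (1.0 - slope)
        # Keep the candidate only if it lies in the region it was solved for.
        if (z_star > 0) == bool(active):
            points.append((z_star, abs(slope) < 1.0))
    return points
```

With a = 0.5, w = 1.0, h = -0.5, the map has a stable fixed point at -1 and an unstable one at 1. Trajectories started just below 1 settle at -1, while those started just above 1 diverge, so the unstable point separates the two behaviors.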