AI · Bullish · arXiv – CS AI · Jun 11 · 7/10
🧠Researchers formalize the theoretical foundations of LLM scaling laws by modeling transformer learning dynamics as differential equations, establishing matching upper and lower bounds that characterize a two-phase convergence pattern: exponential decay during optimization followed by power-law decay during the statistical phase. This work bridges the gap between empirical observations and rigorous mathematical theory, providing independent scaling relationships for model size, training time, and dataset size.
AI · Neutral · arXiv – CS AI · May 28 · 7/10
🧠A comprehensive academic resource presenting the unified mathematical foundations of diffusion models, explaining how three complementary perspectives—variational, score-based, and flow-based—emerge from shared principles. The work bridges theoretical understanding with practical applications including controllable generation and efficient sampling methods.
AI · Neutral · arXiv – CS AI · 2d ago · 6/10
🧠Researchers introduce Prime Fourier Embeddings (PFE), a neural representation method that encodes integers using prime-indexed trigonometric pairs to expose algebraic structure in modular arithmetic. The approach achieves perfect accuracy on modular tasks with specialized neural channels corresponding to individual primes, validated through ablation studies showing 500x specialization ratios between relevant and irrelevant channels.
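As a rough illustration of what "prime-indexed trigonometric pairs" could look like, here is a minimal sketch. It assumes each prime p contributes one (cos, sin) pair at angle 2πn/p; the prime list `PRIMES` and the helper names `prime_fourier_embedding` and `channel` are hypothetical, and the paper's actual construction may differ.

```python
import math

# Hypothetical choice of primes; the summary does not say which ones are used.
PRIMES = (2, 3, 5, 7, 11)


def prime_fourier_embedding(n, primes=PRIMES):
    """Encode integer n as one (cos, sin) pair per prime p, at angle 2*pi*n/p."""
    features = []
    for p in primes:
        # Reducing n mod p first leaves the angle unchanged up to a multiple
        # of 2*pi, and keeps the floating-point result exact for large n.
        angle = 2.0 * math.pi * (n % p) / p
        features.extend((math.cos(angle), math.sin(angle)))
    return features


def channel(embedding, index):
    """Return the (cos, sin) pair for the prime at position `index`."""
    return embedding[2 * index], embedding[2 * index + 1]
```

Under this assumed encoding, two integers that are congruent modulo p share the same pair in the channel for p, which is one way such a representation could make modular structure directly visible to a network.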
AI · Neutral · arXiv – CS AI · Jun 9 · 6/10
🧠Researchers have formulated Transformer data propagation as a nonlinear control system and proven that Gaussian distributions remain Gaussian through the network's layers. This reduces infinite-dimensional dynamics to finite-dimensional equations governing mean and covariance evolution, connecting Transformer expressiveness to classical control theory and revealing conditions for stability or divergence.
AI · Bullish · arXiv – CS AI · Jun 9 · 6/10
🧠Researchers introduce Deep Tree Tensor Networks (DTTN), a novel neural architecture originating from quantum physics that captures exponential-order feature interactions for image recognition. The model demonstrates superior performance across multiple benchmarks while maintaining parameter efficiency through tree-like topology, potentially advancing interpretable AI research.
AI · Neutral · arXiv – CS AI · Jun 4 · 6/10
🧠Researchers provide a rigorous mathematical framework showing how Active Inference and Expected Free Energy (EFE) minimization can be decomposed into Variational Free Energy (VFE) minimization with explicit entropy corrections. The work clarifies the theoretical foundations of EFE-based planning by identifying which corrections are necessary for different decision-making scenarios, demonstrated through grid-world experiments.
AI · Neutral · arXiv – CS AI · Jun 2 · 5/10
🧠A research paper proposes that deep learning models can be understood through tame geometry (o-minimality), a framework that enables convergence guarantees for stochastic gradient descent in nonsmooth, nonconvex settings. This perspective offers a formal foundation for analyzing AI system behavior and training stability.
AI · Neutral · arXiv – CS AI · May 28 · 6/10
🧠A comprehensive academic survey examines how optimal transport and diffusion methods provide unified mathematical frameworks for solving machine learning problems involving time-evolving probability distributions. The research highlights applications across generative AI, neural network optimization, and large language model dynamics, offering computational and theoretical advantages through Lagrangian vector field representations.
AI · Neutral · arXiv – CS AI · May 27 · 6/10
🧠Researchers propose a mathematical framework for autonomous AI agents that implements per-action insurance premiums based on counterfactual risk assessment against safe defaults. The system replaces traditional post-hoc liability coverage with real-time transaction-level risk tolls, establishing formal guarantees for runtime safety and budget constraints.
AI · Neutral · arXiv – CS AI · May 11 · 6/10
🧠Researchers prove that modern neural networks can be represented using a Generalized Singular Value Decomposition that makes them left-invertible before a final linear layer while preserving norm properties. This mathematical framework enables distance calibration between feature space and input space, with demonstrated applications to adversarial perturbation detection and potential future use in addressing model bias and invertibility.
AI · Neutral · arXiv – CS AI · May 11 · 6/10
🧠Researchers present a comprehensive mathematical framework unifying generalized Euler logarithms with applications to machine learning optimization. The work establishes theoretical foundations for deformed exponential functions and introduces new algorithms—Generalized Exponentiated Gradient and Mirror Descent schemes—alongside an Euler-based loss function for neural networks that integrates with natural gradient descent.