#theoretical-foundations News & Analysis

17 articles tagged with #theoretical-foundations. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Jun 5 · 7/10

Your GFlowNet Secretly Learns an Optimal Transport Plan

Researchers establish a theoretical connection between Generative Flow Networks (GFlowNets) and optimal transport theory, demonstrating that minimum-flow GFlowNets reduce to Kantorovich optimal transport problems. This framework enables GFlowNets to learn optimal transport plans on large graphs through neural parameterization, with experimental validation confirming alignment with exact solvers.
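For readers unfamiliar with the Kantorovich problem mentioned above, here is a minimal sketch of what an exact solver computes on a tiny instance. It assumes uniform marginals over equally many sources and targets, in which case an optimal plan can always be taken to be a permutation (Birkhoff-von Neumann), so brute force suffices. The function name and the toy points are hypothetical and are not taken from the paper; this is not the GFlowNet method itself.

```python
from itertools import permutations


def optimal_transport_uniform(cost):
    """Brute-force Kantorovich OT between two uniform distributions on n points.

    Assumption: both marginals put mass 1/n on each point, so an optimal
    plan exists among the permutation matrices scaled by 1/n.
    """
    n = len(cost)
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(n)):
        # Each source i ships its mass 1/n to target perm[i].
        total = sum(cost[i][perm[i]] for i in range(n)) / n
        if total < best_cost:
            best_perm, best_cost = perm, total
    # Transport plan: plan[i][j] is the mass moved from source i to target j.
    plan = [[1.0 / n if best_perm[i] == j else 0.0 for j in range(n)]
            for i in range(n)]
    return plan, best_cost


# Toy 1-D example with squared-distance cost (illustrative values only).
sources = [0.0, 1.0, 3.0]
targets = [3.5, 0.5, 2.0]
cost = [[(x - y) ** 2 for y in targets] for x in sources]
plan, total_cost = optimal_transport_uniform(cost)
```

Brute force scales factorially, which is why large graphs need either a linear-programming solver or a learned parameterization of the kind the summary describes.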

AI · Neutral · arXiv – CS AI · Jun 2 · 7/10

A Fiber Criterion for Representation Identifiability in Supervised Learning

A new theoretical framework formalizes when properties of a representation in supervised learning can be uniquely identified from input-output behavior alone. It shows that representation-level claims require assumptions beyond predictive performance, because auxiliary information can be added to a representation without changing the predictor's outputs. This challenges common assumptions about what supervised learning actually determines.

AI · Neutral · arXiv – CS AI · May 9 · 7/10

Are Flat Minima an Illusion?

A research paper challenges the prevailing assumption that flat minima in neural network loss landscapes improve generalization, arguing instead that 'weakness'—the volume of function-compatible parameter configurations—is the true driver of generalization. The author demonstrates that flatness is reparameterization-dependent and thus not causally responsible for better performance, while weakness remains invariant across different parameterizations.
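The claim that flatness depends on the parameterization can be illustrated with the well-known ReLU rescaling argument. The sketch below is an assumption about the kind of argument involved, not the paper's own construction, and it does not compute the paper's "weakness" measure. It rescales a two-weight ReLU model so that the function is unchanged while a Hessian-trace sharpness proxy changes. All names and data are hypothetical.

```python
def relu(z):
    return z if z > 0.0 else 0.0


def predict(w1, w2, x):
    # Tiny one-hidden-unit ReLU model: f(x) = w2 * relu(w1 * x).
    return w2 * relu(w1 * x)


def loss(w1, w2, xs, ys):
    # Mean squared error over the toy dataset.
    return sum((predict(w1, w2, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def hessian_trace(w1, w2, xs, ys, h=1e-3):
    # Central second differences along each weight; their sum is the
    # trace of the loss Hessian, used here as a simple sharpness proxy.
    base = loss(w1, w2, xs, ys)
    d2_w1 = (loss(w1 + h, w2, xs, ys) - 2 * base + loss(w1 - h, w2, xs, ys)) / h**2
    d2_w2 = (loss(w1, w2 + h, xs, ys) - 2 * base + loss(w1, w2 - h, xs, ys)) / h**2
    return d2_w1 + d2_w2


def rescale(w1, w2, alpha):
    # For alpha > 0, relu(alpha * z) = alpha * relu(z), so the function is unchanged.
    return alpha * w1, w2 / alpha


XS = [0.5, 1.0, 2.0]
YS = [0.5, 1.0, 2.0]  # targets generated by f(x) = x, so (1, 1) is a minimum

theta_a = (1.0, 1.0)
theta_b = rescale(*theta_a, alpha=4.0)

sharpness_a = hessian_trace(*theta_a, XS, YS)
sharpness_b = hessian_trace(*theta_b, XS, YS)
```

Both parameter settings compute the same function on every input, yet `sharpness_b` is several times larger than `sharpness_a`, which is the sense in which a Hessian-based flatness measure is not invariant to reparameterization.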

AI · Neutral · arXiv – CS AI · Jun 23 · 6/10

On the Expressive Power of Weight Quantization in Large Language Models

Researchers establish theoretical limits on weight quantization in large language models, identifying 1.58-bit as the minimum precision threshold before expressive collapse occurs. The study demonstrates that model performance degrades polynomially as quantization bits decrease, providing theoretical foundations for optimizing model compression and inference acceleration techniques.
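The 1.58-bit figure presumably corresponds to ternary weights, since a weight with three possible values carries log2(3) ≈ 1.585 bits. That reading is an assumption; the summary does not state it. A minimal sketch of the arithmetic, with a hypothetical threshold-based ternary quantizer that is not the paper's scheme:

```python
import math


def bits_per_weight(num_levels):
    # Information content of one weight drawn from num_levels distinct values.
    return math.log2(num_levels)


def quantize_ternary(weights, threshold):
    # Hypothetical quantizer: map each weight to -1, 0, or +1.
    # Weights with magnitude at or below the threshold become 0.
    return [0 if abs(w) <= threshold else (1 if w > 0 else -1) for w in weights]


ternary_bits = bits_per_weight(3)   # three levels: -1, 0, +1
binary_bits = bits_per_weight(2)    # two levels, for comparison
codes = quantize_ternary([0.9, -0.05, -0.7, 0.2], threshold=0.1)
```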

AI · Neutral · arXiv – CS AI · Jun 23 · 6/10

On the Sparsity-Storage-Accuracy Tradeoff in Parsimoniously Activated Dictionary Learning

Researchers present a theoretical framework for parsimoniously activated dictionary learning (PADL) that constrains the number of active dictionary atoms rather than using traditional element-wise sparsity. The work establishes a probabilistic interpretation of PADL, derives analytical tradeoffs between sparsity, storage, and accuracy, and demonstrates practical improvements in vision and vision-language model inference.

AI · Neutral · arXiv – CS AI · Jun 11 · 6/10

Autoregressive Direct Preference Optimization

Researchers propose Autoregressive Direct Preference Optimization (ADPO), a refined theoretical framework for aligning large language models with human preferences. The method explicitly incorporates autoregressive assumptions before applying the Bradley-Terry model, which yields a revised loss function and two distinct length measures (token length and feedback length) for LLM preference alignment.

AI · Neutral · arXiv – CS AI · Jun 10 · 6/10

A Theory on Flow Matching with Neural Networks

Researchers develop theoretical foundations for flow matching, a generative modeling technique based on neural networks, establishing convergence guarantees and generalization bounds and supporting them with experiments. The work bridges the gap between practical flow-matching implementations and rigorous mathematical theory, supporting the reliability of neural-network-based conditional velocity fields for generating high-quality samples.

AI · Neutral · arXiv – CS AI · Jun 10 · 6/10

Convergence of Monte Carlo Optimistic Policy Iteration: Beyond Uniform State-Action Updates

Researchers prove that Monte Carlo optimistic policy iteration converges to optimal solutions under more practical conditions than previously known, relaxing the requirement for uniform initialization across the entire state-action space to only requiring uniformity within each state's actions. This theoretical advance enables scalable reinforcement learning implementations when state spaces are large or unknown.

AI · Neutral · arXiv – CS AI · Jun 8 · 6/10

Learning to Execute Graph Algorithms Exactly with Graph Neural Networks

Researchers demonstrate that graph neural networks can learn to execute classical graph algorithms exactly through a two-step training process combining MLPs with NTK theory. The work establishes rigorous theoretical learnability results for distributed computing models and practical algorithms like breadth-first search and Bellman-Ford, advancing understanding of what GNNs can provably learn.
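As a reference point for what "executing the algorithm exactly" would have to reproduce, here is a minimal sketch of classical Bellman-Ford. It is the textbook algorithm, not the paper's GNN construction, and the function name and edge-list format are choices made for this illustration.

```python
def bellman_ford(num_nodes, edges, source):
    """Single-source shortest paths; edges is a list of (u, v, weight) tuples."""
    dist = [float("inf")] * num_nodes
    dist[source] = 0.0
    # Relax every edge up to num_nodes - 1 times; stop early once stable.
    for _ in range(num_nodes - 1):
        updated = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                updated = True
        if not updated:
            break
    # One more pass: any further improvement means a reachable negative cycle.
    for u, v, w in edges:
        if dist[u] + w < dist[v]:
            raise ValueError("graph contains a negative-weight cycle")
    return dist


# Small directed graph (illustrative values only).
edges = [(0, 1, 4.0), (0, 2, 1.0), (2, 1, 2.0), (1, 3, 1.0)]
distances = bellman_ford(4, edges, source=0)
```

Each relaxation round only combines a node's value with values from its incoming neighbors, which is the local, message-passing structure that makes the algorithm a natural target for graph neural networks.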

AI · Neutral · arXiv – CS AI · Jun 5 · 6/10

The Score Hamiltonian: Mapping Diffusion Models to Adiabatic Transport

Researchers establish a mathematical correspondence between score-based diffusion models and quantum adiabatic transport, revealing that sampling performance is fundamentally limited by the ratio of score-matching error to spectral gap. The correspondence provides new bounds for density reconstruction and principled methods for designing annealing schedules in generative AI systems.

AI · Neutral · arXiv – CS AI · Jun 2 · 6/10

Interpreting FCDNNs via RG on Exponential Family

Researchers establish a theoretical bridge between renormalization group (RG) methods from statistical physics and the training of fully connected deep neural networks (FCDNNs), proving that optimal network parameters correspond to RG fixed points for exponential family distributions. This work extends prior results from discrete to continuous data, providing a mathematical foundation for understanding why deep learning effectively extracts features from real-world datasets.

AI · Neutral · arXiv – CS AI · Jun 1 · 6/10

Why Linear Recurrent Memory Works in Partially Observable Reinforcement Learning

Researchers provide theoretical foundations for why linear recurrent neural networks excel as memory units in partially observable reinforcement learning environments. The study demonstrates that linear filters can exactly reproduce belief vectors in hidden Markov models under deterministic conditions and nearly eliminate state ambiguity, offering mathematical justification for their empirical success.

AI · Neutral · arXiv – CS AI · May 28 · 6/10

On the Learnability of Test-Time Adaptation: A Recovery Complexity Perspective

Researchers introduce the first theoretical framework for analyzing test-time adaptation (TTA) in machine learning, establishing recovery complexity bounds that reveal fundamental limits on how quickly models can adapt to non-stationary data streams without labeled data. The work provides mathematical guarantees for TTA learnability and identifies an intrinsic trade-off between adaptivity and information constraints.

AI · Neutral · arXiv – CS AI · May 28 · 5/10

Domain size asymptotics for Markov logic networks

Researchers analyze how Markov logic networks (MLNs) behave as domain size increases, demonstrating that probability distributions determined by MLNs diverge significantly from uniform distributions. The work provides an asymptotic characterization for single-relation languages and proves that MLNs and lifted Bayesian networks differ fundamentally in their distributional properties.

AI · Neutral · arXiv – CS AI · May 27 · 6/10

On the Detection of Commutative Factors in Factor Graphs: Necessary and Sufficient Conditions

Researchers have identified critical flaws in the state-of-the-art algorithm for detecting commutative factors in factor graphs, a foundational technique for lifted probabilistic inference. The algorithm incorrectly treats a necessary condition as sufficient, potentially producing incorrect results. The authors provide corrected algorithms that maintain efficiency while ensuring correctness.

AI · Neutral · arXiv – CS AI · May 9 · 6/10

Unifying Goal-Conditioned RL and Unsupervised Skill Learning via Control-Maximization

Researchers unify goal-conditioned reinforcement learning (GCRL) and mutual information skill learning (MISL) under a control-maximization framework, proving that diverse unsupervised skills learned through MISL provide theoretical guarantees for downstream goal-reaching tasks. The work establishes formal bounds connecting different pretraining objectives to specific downstream GCRL formulations, providing theoretical justification for RL pretraining strategies.

AI · Neutral · arXiv – CS AI · Mar 3 · 6/10 · 3

Theoretical Foundations of Superhypergraph and Plithogenic Graph Neural Networks

Researchers have developed theoretical foundations for SuperHyperGraph Neural Networks (SHGNNs) and Plithogenic Graph Neural Networks, extending traditional graph neural networks to handle complex hierarchical structures and multi-valued attributes. These advanced frameworks aim to better model uncertainty and higher-order interactions in complex networks beyond the capabilities of standard graph neural networks.