🧠 AI | ⚪ Neutral | Importance 6/10

Interpreting FCDNNs via RG on Exponential Family

arXiv – CS AI|Fuzhou Gong, Zigeng Xia|June 2, 2026 at 04:00 AM

🤖 AI Summary

Researchers establish a theoretical bridge between renormalization group (RG) methods from statistical physics and deep neural network (DNN) training. They prove that, for data drawn from exponential family distributions, the optimal parameters of a fully connected DNN correspond to RG fixed points. The work extends prior results from discrete to continuous data and provides a mathematical foundation for understanding why deep learning extracts features effectively from real-world datasets.

Analysis

This research addresses a fundamental gap in deep learning interpretability by connecting statistical physics principles to neural network behavior. The authors demonstrate that when a fully connected DNN (FCDNN) is trained to its optimal parameters, the characteristic parameters of its feature layer outputs match the fixed points that renormalization group methods produce when applied to the input data distribution. This equivalence suggests that DNNs inherently perform an operation similar to RG coarse-graining, which compresses a complex system by identifying the features that remain invariant across scales.
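To make the idea of an RG fixed point concrete, here is a minimal Python sketch of the textbook decimation step for the zero-field one-dimensional Ising chain. It is an illustration only and is not taken from the paper: the function names `decimate` and `rg_flow` are hypothetical, and the recursion K' = 0.5 * ln(cosh(2K)) is the standard result of summing out every other spin. Repeated coarse-graining drives any finite coupling toward the fixed point K* = 0.

```python
import math


def decimate(k: float) -> float:
    """One decimation step for the zero-field 1D Ising chain.

    Summing out every other spin maps the nearest-neighbour coupling K
    (measured in units of k_B * T) to K' = 0.5 * ln(cosh(2K)).
    Hypothetical helper name; the recursion itself is the textbook result.
    """
    return 0.5 * math.log(math.cosh(2.0 * k))


def rg_flow(k0: float, steps: int = 20) -> list[float]:
    """Return the couplings visited by repeated decimation, starting at k0."""
    flow = [k0]
    for _ in range(steps):
        flow.append(decimate(flow[-1]))
    return flow


if __name__ == "__main__":
    # The coupling shrinks at every step and settles at the fixed point K* = 0.
    for k in rg_flow(1.0, steps=8):
        print(f"{k:.6f}")
```

A fixed point is a coupling that the map leaves unchanged, here `decimate(0.0) == 0.0`. The claim summarized above is that the trained network's feature layer parameters play the role of such a fixed point for the input distribution; the sketch does not attempt to reproduce that result.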

The work builds on earlier results established for discrete data (the one-dimensional Ising model) and extends them to continuous exponential family distributions, bringing the framework closer to real-world applicability. The exponential family is a broad class that includes the normal, gamma, and exponential distributions, as well as discrete members such as the Poisson, all of which are foundational models across statistics and machine learning. By proving that the correspondence holds for such distributions, the authors strengthen the theoretical scaffolding supporting DNN interpretability.
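For readers unfamiliar with the term, the standard textbook form of an exponential family density is sketched below, followed by the normal distribution written in that form. The notation (h, η, T, A) is the conventional one and is an assumption here; the paper may use different symbols or a different parameterization.

```latex
% General exponential family density with natural parameter \eta(\theta),
% sufficient statistic T(x), base measure h(x), and log-partition A(\theta):
p(x \mid \theta) = h(x)\,\exp\!\bigl(\eta(\theta)^{\top} T(x) - A(\theta)\bigr)

% Example: the normal distribution N(\mu, \sigma^2) in this form.
h(x) = \frac{1}{\sqrt{2\pi}}, \qquad
T(x) = \begin{pmatrix} x \\ x^{2} \end{pmatrix}, \qquad
\eta(\mu, \sigma^{2}) = \begin{pmatrix} \mu / \sigma^{2} \\ -1 / (2\sigma^{2}) \end{pmatrix}, \qquad
A(\mu, \sigma^{2}) = \frac{\mu^{2}}{2\sigma^{2}} + \ln \sigma
```

Substituting these four pieces into the general form recovers the familiar density exp(-(x - μ)² / (2σ²)) / (σ√(2π)).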

The implications reach beyond academic theory. DNNs have achieved remarkable empirical success on real-world data even though their mechanisms are only partly understood theoretically. This research offers a physics-inspired account of that success: networks perform dimensionality reduction and feature extraction by converging to renormalization group fixed points, retaining the most relevant structural features in the data while discarding noise and redundancy.

Future research should explore whether this framework extends to other network architectures beyond fully connected networks, particularly convolutional and transformer models that dominate modern applications. Validating these theoretical predictions on practical datasets would demonstrate whether the mathematical correspondence translates into actionable insights for network design and optimization.

Key Takeaways

→Training a fully connected DNN is shown to parallel applying renormalization group methods from statistical physics to the input data distribution.
→Optimal DNN parameters converge to fixed points that match characteristic parameters produced by RG calculations on exponential family data.
→This theoretical framework explains how DNNs extract main features from data similarly to how RG methods identify invariant properties across scales.
→Results extend from discrete (Ising model) to continuous data distributions, advancing applicability to real-world datasets.
→The correspondence offers a theoretical explanation for DNN performance and a foundation for deep learning interpretability.