y0news
#approximation-theory · 2 articles
AI · Bullish · arXiv – CS AI · 5h ago

On the Expressive Power of Transformers for Maxout Networks and Continuous Piecewise Linear Functions

Researchers establish theoretical foundations for the expressive power of Transformer networks by connecting them to maxout networks and continuous piecewise linear functions. The study proves that Transformers inherit the universal approximation capabilities of ReLU networks, and shows that self-attention layers implement max-type operations while feedforward layers perform token-wise affine transformations.
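For context on the maxout connection: a maxout unit takes the maximum over several affine maps, which is exactly a continuous piecewise linear function, and a ReLU is the two-piece special case. A minimal NumPy sketch (names and shapes here are illustrative, not from the paper):

```python
import numpy as np

def maxout_unit(x, W, b):
    """Max over k affine pieces: max_i (W[i] @ x + b[i]).
    This is a continuous piecewise linear (CPWL) function of x."""
    return np.max(W @ x + b)

# A ReLU is the special case k = 2 with one piece pinned to zero:
# relu(w @ x) = max(w @ x, 0).
rng = np.random.default_rng(0)
W = rng.standard_normal((3, 4))  # k = 3 affine pieces on a 4-d input
b = rng.standard_normal(3)
x = rng.standard_normal(4)
print(maxout_unit(x, W, b))
```

The max-over-affine-pieces form is what links maxout units to the max-type operations the summary attributes to self-attention.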

AI · Neutral · arXiv – CS AI · 5h ago

Covering Numbers for Deep ReLU Networks with Applications to Function Approximation and Nonparametric Regression

Researchers have derived tight bounds on the covering numbers of deep ReLU neural networks, giving fundamental insight into network capacity and approximation ability. The work removes a log^6(n) factor from the best known sample-complexity rate for estimating Lipschitz functions with deep networks, establishing optimality in nonparametric regression.
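As a quick recap of the central quantity (the standard textbook definition, stated here for the sup-norm; the paper's exact setting may differ): the covering number of a function class $\mathcal{F}$ at scale $\varepsilon$ is the size of the smallest $\varepsilon$-net,

```latex
N(\varepsilon, \mathcal{F}, \|\cdot\|_\infty)
  = \min\Bigl\{ m : \exists\, f_1, \dots, f_m \ \text{such that}\
      \sup_{f \in \mathcal{F}} \min_{1 \le i \le m} \|f - f_i\|_\infty \le \varepsilon \Bigr\}.
```

Bounds on $\log N(\varepsilon, \mathcal{F}, \|\cdot\|_\infty)$ (the metric entropy) are what drive sample-complexity rates in nonparametric regression, which is why tightening them can shave logarithmic factors like the log^6(n) mentioned above.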