y0news

#activation-functions News & Analysis

4 articles tagged with #activation-functions. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

4 articles
AI · Bullish · arXiv – CS AI · Mar 3 · 7/10
🧠

Polynomial, trigonometric, and tropical activations

Researchers developed new activation functions for deep neural networks based on polynomial and trigonometric orthonormal bases that can successfully train models like GPT-2 and ConvNeXt. The work addresses gradient problems common with polynomial activations and shows these networks can be interpreted as multivariate polynomial mappings.
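The summary above describes activations built from orthonormal polynomial bases. As a hedged illustration of the general idea (the paper's exact basis and normalization are not given here), the sketch below uses Chebyshev polynomials, a standard orthogonal family, and squashes inputs into their domain to sidestep the exploding-gradient issue the summary mentions:

```python
import numpy as np

def chebyshev_activation(x, coeffs):
    """Activation from a Chebyshev polynomial basis (illustrative only;
    the paper's actual basis and coefficients are assumptions here).
    Requires at least two coefficients."""
    # Squash inputs into [-1, 1], where Chebyshev polynomials are
    # orthogonal, to keep gradients bounded.
    t = np.tanh(x)
    # Chebyshev recurrence: T_0 = 1, T_1 = t, T_{n+1} = 2*t*T_n - T_{n-1}
    T_prev, T_curr = np.ones_like(t), t
    out = coeffs[0] * T_prev + coeffs[1] * T_curr
    for c in coeffs[2:]:
        T_prev, T_curr = T_curr, 2 * t * T_curr - T_prev
        out = out + c * T_curr
    return out
```

Because each layer then computes a polynomial of its input, a stack of such layers composes into one multivariate polynomial mapping, matching the interpretation noted in the summary.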

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10
🧠

PolyGLU: State-Conditional Activation Routing in Transformer Feed-Forward Networks

Researchers introduce PolyGLU, a new transformer architecture that enables dynamic routing among multiple activation functions, mimicking biological neural diversity. The 597M-parameter PolychromaticLM model shows emergent specialization patterns and achieves strong performance despite training on significantly fewer tokens than comparable models.
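The core mechanism, dynamic routing among several activation functions, can be sketched as a learned gate that mixes an activation bank per position. Everything below (the gate shape, the particular bank of ReLU/tanh/SiLU, the softmax mixing) is an assumed illustration of the idea, not PolyGLU's actual design:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def routed_activation(x, gate_w):
    """State-conditional mixing over a bank of activations.
    `gate_w` (dim, num_fns) and the bank are hypothetical choices."""
    bank = [
        lambda v: np.maximum(v, 0.0),      # ReLU
        np.tanh,                           # tanh
        lambda v: v / (1.0 + np.exp(-v)),  # SiLU
    ]
    gates = softmax(x @ gate_w)                 # (batch, num_fns)
    outs = np.stack([f(x) for f in bank], -1)   # (batch, dim, num_fns)
    # Convex combination of activation outputs, conditioned on x.
    return (outs * gates[:, None, :]).sum(-1)   # (batch, dim)
```

With hard (argmax) gates this degenerates into per-token selection of a single activation, which is one way the "emergent specialization" described above could surface.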

๐Ÿข Nvidia
AI · Bullish · arXiv – CS AI · Mar 2 · 7/10
🧠

Activation Function Design Sustains Plasticity in Continual Learning

A new arXiv paper demonstrates that activation function design is crucial for maintaining neural network plasticity in continual learning. The authors introduce two new activation functions, Smooth-Leaky and Randomized Smooth-Leaky, that help prevent models from losing their ability to adapt to new tasks over time.
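The names Smooth-Leaky and Randomized Smooth-Leaky come from the paper, but the formulas below are an assumed interpretation for illustration: a softplus-based curve that interpolates smoothly between a leaky negative-side slope and unit positive-side slope, with the randomized variant sampling that slope per call:

```python
import numpy as np

def smooth_leaky(x, alpha=0.1, beta=1.0):
    """Smooth leaky-ReLU-style activation (assumed form, not
    necessarily the paper's). Slope -> alpha as x -> -inf, -> 1 as
    x -> +inf, with a smooth transition of sharpness beta."""
    # Numerically stable softplus: max(x, 0) + log(1 + exp(-|beta*x|))/beta
    softplus = np.maximum(x, 0.0) + np.log1p(np.exp(-np.abs(beta * x))) / beta
    return alpha * x + (1.0 - alpha) * softplus

def randomized_smooth_leaky(x, rng, alpha_range=(0.05, 0.3), beta=1.0):
    """Randomized variant: sample the leaky slope each forward pass
    (one plausible reading of 'Randomized Smooth-Leaky')."""
    alpha = rng.uniform(*alpha_range)
    return smooth_leaky(x, alpha=alpha, beta=beta)
```

Keeping a nonzero, smooth negative-side slope means units never saturate to exactly zero gradient, which is one mechanism by which such designs can preserve plasticity across tasks.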

$LINK
AI · Bullish · arXiv – CS AI · Feb 27 · 6/10
🧠

GRAU: Generic Reconfigurable Activation Unit Design for Neural Network Hardware Accelerators

Researchers propose GRAU, a new reconfigurable activation unit design for neural network hardware accelerators that uses piecewise linear fitting with power-of-two slopes. The design reduces LUT consumption by over 90% compared to traditional multi-threshold activators while supporting mixed-precision quantization and nonlinear functions.
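The key trick the summary describes, piecewise linear segments whose slopes are powers of two, pays off in hardware because a fixed-point multiply becomes a bit shift. A hedged software sketch of that idea (the breakpoint/shift/offset tables are illustrative, not GRAU's actual design):

```python
def pwl_pow2_activation(x_q, breakpoints, shifts, offsets):
    """Piecewise-linear activation with power-of-two slopes on a
    quantized integer input. Each segment's slope is 2^k, applied as
    a bit shift instead of a multiplier (the hardware motivation).
    Tables here are hypothetical examples, not GRAU's."""
    # Find which segment the input falls in (breakpoints ascending).
    seg = 0
    for i, b in enumerate(breakpoints):
        if x_q >= b:
            seg = i + 1
    k = shifts[seg]
    # Slope 2^k via shift; negative k means an arithmetic right shift.
    y = (x_q << k) if k >= 0 else (x_q >> -k)
    return y + offsets[seg]
```

For example, a single breakpoint at zero with shifts of -3 and 0 yields a leaky-ReLU-like curve (slope 1/8 below zero, slope 1 above) using no multipliers at all, which is consistent with the LUT savings claimed above.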