y0news

Polynomial, trigonometric, and tropical activations

arXiv – CS AI | Ismail Khalfaoui-Hassani, Stefan Kesselheim
🤖AI Summary

Researchers developed new activation functions for deep neural networks based on polynomial and trigonometric orthonormal bases, and used them to train models such as GPT-2 and ConvNeXt. The work addresses the exploding and vanishing gradients common with polynomial activations and shows that networks using such activations can be interpreted as multivariate polynomial mappings.
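
As an illustration of why an orthonormal basis helps keep activations at a stable scale, here is a minimal sketch. It assumes standard-normal pre-activations and probabilists' Hermite polynomials scaled by 1/sqrt(k!), which makes the basis orthonormal under the Gaussian measure, so the second moment of the output equals the squared norm of the coefficients. The function names and the equal-weight initialization are hypothetical and not taken from the paper.

```python
import math

import numpy as np
from numpy.polynomial.hermite_e import hermegauss, hermeval


def hermite_activation(x, coeffs):
    """Evaluate sum_k coeffs[k] * He_k(x) / sqrt(k!) elementwise."""
    coeffs = np.asarray(coeffs, dtype=float)
    # He_k has squared norm k! under the standard normal, so divide by sqrt(k!).
    scale = np.array([math.sqrt(math.factorial(k)) for k in range(len(coeffs))])
    return hermeval(x, coeffs / scale)


def unit_norm_coeffs(degree):
    """Hypothetical init: no constant term, equal weight on degrees 1..degree."""
    c = np.ones(degree + 1)
    c[0] = 0.0  # zero constant term gives a zero-mean output
    return c / np.linalg.norm(c)


# Gauss-Hermite quadrature gives exact Gaussian expectations of polynomials.
nodes, weights = hermegauss(20)
weights = weights / np.sqrt(2.0 * np.pi)  # normalize to a probability measure

y = hermite_activation(nodes, unit_norm_coeffs(3))
mean = float(np.sum(weights * y))
second_moment = float(np.sum(weights * y ** 2))
print(mean, second_moment)
```

With unit-norm coefficients the output keeps zero mean and unit second moment under these assumptions, which is the property a variance-preserving initialization is after. Real pre-activations are only approximately Gaussian, so this should be read as a sketch of the idea and not as the authors' procedure.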

Key Takeaways
  • New activation functions based on Hermite polynomial, Fourier trigonometric, and tropical polynomial bases can be used to train deep neural networks effectively.
  • A variance-preserving initialization addresses the exploding and vanishing gradients typically associated with polynomial activations.
  • The authors trained GPT-2 for text prediction and ConvNeXt for image classification with these activations.
  • Networks with polynomial activations can be mathematically interpreted as multivariate polynomial mappings, providing new structural insights.
  • The activations can approximate classical ones in pre-trained models using Hermite interpolation, making them useful for fine-tuning tasks.
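
To make the tropical case above concrete, here is a minimal sketch of a tropical polynomial in the max-plus semiring, where addition becomes `max` and multiplication becomes `+`, so the k-th power of x is k·x and the polynomial is a maximum of affine pieces. The function name `tropical_polynomial` and the coefficient layout are assumptions for illustration; the paper's exact parameterization may differ.

```python
import numpy as np


def tropical_polynomial(x, coeffs):
    """Max-plus polynomial: max over k of (coeffs[k] + k * x), elementwise in x."""
    x = np.asarray(x, dtype=float)
    coeffs = np.asarray(coeffs, dtype=float)
    degrees = np.arange(len(coeffs))
    # One affine piece per degree, stacked along a new trailing axis.
    pieces = coeffs + degrees * x[..., None]
    return pieces.max(axis=-1)


x = np.linspace(-2.0, 2.0, 9)

# With coefficients [0, 0] the pieces are 0 and x, so the result is max(0, x).
relu_like = tropical_polynomial(x, [0.0, 0.0])

# A degree-2 example: max(0, x - 1, 2x - 4), a convex piecewise-linear function.
piecewise = tropical_polynomial(x, [0.0, -1.0, -4.0])
```

The first call reproduces ReLU, which shows how a classical activation appears as a special case of this family once the coefficients are treated as learnable parameters.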