Polynomial, trigonometric, and tropical activations

arXiv – CS AI | Ismail Khalfaoui-Hassani, Stefan Kesselheim | March 3, 2026 at 05:00 AM

🤖AI Summary

Researchers developed activation functions for deep neural networks, based on polynomial and trigonometric orthonormal bases, that can train models such as GPT-2 and ConvNeXt. The work addresses the exploding and vanishing gradients common with polynomial activations and shows that networks using polynomial activations can be interpreted as multivariate polynomial mappings.
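To see why a network with polynomial activations is itself a polynomial map, consider a minimal two-layer sketch. The symbols W_1, W_2, b_1, b_2, a_k, and d are illustrative and do not come from the paper.

```latex
\[
f(x) = W_2\,\sigma(W_1 x + b_1) + b_2,
\qquad
\sigma(z) = \sum_{k=0}^{d} a_k z^k \quad \text{(applied elementwise)}.
\]
% Each coordinate of W_1 x + b_1 is affine in x, so each coordinate of
% \sigma(W_1 x + b_1) is a polynomial of degree at most d in the entries of x.
% The outer affine map preserves this, so every output of f is a multivariate
% polynomial of degree at most d. Stacking L such layers gives degree at most d^L.
```

The degree bound d^L grows quickly with depth, which is one way to see why uncontrolled polynomial activations tend to produce exploding or vanishing signals.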

Key Takeaways

→ New activation functions based on Hermite polynomial, Fourier trigonometric, and tropical polynomial bases can train deep neural networks effectively.
→ Variance-preserving initialization addresses the exploding and vanishing gradients typically associated with polynomial activations.
→ The authors demonstrate training GPT-2 for text prediction and ConvNeXt for image classification with these activations.
→ Networks with polynomial activations can be interpreted as multivariate polynomial mappings, which provides new structural insights.
→ Hermite interpolation lets the new activations approximate classical ones in pre-trained models, which makes them useful for fine-tuning.
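The variance-preservation idea behind the Hermite takeaways can be illustrated with a short NumPy sketch. It relies on a standard fact: the probabilists' Hermite polynomials He_k, divided by sqrt(k!), are orthonormal under the standard Gaussian, so an activation built from them has second moment equal to the sum of its squared coefficients. The helper names and the equal-weight coefficient choice below are assumptions made for illustration, not necessarily the initialization used in the paper.

```python
import math

import numpy as np
from numpy.polynomial.hermite_e import hermegauss, hermeval


def hermite_activation(x, coeffs):
    """Evaluate sum_k coeffs[k] * He_k(x) / sqrt(k!) elementwise."""
    coeffs = np.asarray(coeffs, dtype=float)
    norms = np.array([math.sqrt(math.factorial(k)) for k in range(len(coeffs))])
    return hermeval(x, coeffs / norms)


def unit_second_moment_coeffs(degree):
    """Hypothetical init: no constant term, equal weight on degrees 1..degree.

    The squared coefficients sum to 1, so E[f(x)^2] = 1 for x ~ N(0, 1).
    """
    coeffs = np.zeros(degree + 1)
    coeffs[1:] = 1.0 / math.sqrt(degree)
    return coeffs


# Gauss-Hermite quadrature computes Gaussian expectations of polynomials
# exactly (up to floating-point error) when enough nodes are used.
nodes, weights = hermegauss(20)
weights = weights / weights.sum()  # normalize so the weights act as N(0, 1) probabilities

coeffs = unit_second_moment_coeffs(3)
second_moment = float(np.sum(weights * hermite_activation(nodes, coeffs) ** 2))

# For contrast: the unnormalized He_3 alone has second moment 3! = 6.
raw_he3_moment = float(np.sum(weights * hermeval(nodes, [0, 0, 0, 1]) ** 2))

print(second_moment, raw_he3_moment)
```

With the orthonormal scaling, a unit-variance Gaussian input yields a unit second moment at the output, whereas the raw degree-3 polynomial inflates it by a factor of 6. Repeating that inflation across layers is the kind of growth a variance-preserving initialization is meant to prevent.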