🤖 AI Summary
Researchers developed a new method for training transformer neural networks in which linear layers are parameterized by discrete cosine transform (DCT) coefficients, matching baseline performance while using only 52% of the parameters. The technique requires no architectural changes: it simply replaces standard linear layers with spectral layers that store a truncated set of DCT coefficients instead of full weight matrices.
Key Takeaways
- New DCT-based parameterization reduces transformer model size by 48% while maintaining identical performance on language-modeling tasks.
- At 4x compression (29% of parameters), the method outperforms low-rank baselines, achieving better perplexity.
- The technique requires no pre-trained models, architectural changes, or auxiliary losses: it only replaces linear layers.
- The method stores only low-frequency DCT coefficients and reconstructs the full weight matrices during forward passes (see the sketch after this list).
- Demonstrates practical model compression without performance degradation for resource-constrained AI applications.
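To make the reconstruction step concrete, here is a minimal sketch of such a spectral layer, assuming a PyTorch setting; this is not the authors' code, and the class name `SpectralLinear`, the `keep_ratio` parameter, and the initialization scheme are illustrative assumptions. It stores only a low-frequency block of 2D DCT coefficients and rebuilds the dense weight via an inverse 2D DCT on each forward pass.

```python
# Hypothetical sketch of a DCT-parameterized linear layer (not the paper's code).
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

def dct_matrix(n: int) -> torch.Tensor:
    """Orthonormal DCT-II basis matrix of shape (n, n)."""
    k = torch.arange(n).unsqueeze(1).float()   # frequency index
    i = torch.arange(n).unsqueeze(0).float()   # position index
    basis = torch.cos(math.pi * (i + 0.5) * k / n) * math.sqrt(2.0 / n)
    basis[0] /= math.sqrt(2.0)                 # rescale DC row for orthonormality
    return basis

class SpectralLinear(nn.Module):
    """Drop-in linear layer parameterized by low-frequency DCT coefficients."""
    def __init__(self, in_features, out_features, keep_ratio=0.72, bias=True):
        super().__init__()
        k_out = max(1, int(out_features * keep_ratio))
        k_in = max(1, int(in_features * keep_ratio))
        # Trainable block: k_out * k_in parameters instead of
        # out_features * in_features (~52% when keep_ratio = 0.72).
        self.coeff = nn.Parameter(torch.randn(k_out, k_in) * 0.02)
        # Fixed (non-trainable) truncated inverse-DCT bases.
        self.register_buffer("basis_out", dct_matrix(out_features)[:k_out])
        self.register_buffer("basis_in", dct_matrix(in_features)[:k_in])
        self.bias = nn.Parameter(torch.zeros(out_features)) if bias else None

    def forward(self, x):
        # Inverse 2D DCT of the kept low-frequency block -> dense weight.
        weight = self.basis_out.t() @ self.coeff @ self.basis_in
        return F.linear(x, weight, self.bias)

# Usage: swap nn.Linear(512, 512) for SpectralLinear(512, 512, keep_ratio=0.72).
layer = SpectralLinear(512, 512)
print(layer(torch.randn(4, 512)).shape)  # torch.Size([4, 512])
```

With square layers, keeping roughly 72% of the frequencies per axis yields about 0.72² ≈ 52% of the original parameters, consistent with the headline figure; only the coefficient block is trained, while the DCT bases stay fixed.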
#transformer #model-compression #dct #neural-networks #parameter-reduction #ai-efficiency #machine-learning #language-modeling
Read Original → via arXiv – CS AI