AI · Bullish · arXiv – CS AI · 5h ago
Data-Aware Random Feature Kernel for Transformers
Researchers introduce DARKFormer, a transformer architecture that reduces the computational complexity of attention from quadratic to linear in sequence length while maintaining performance. The model uses data-aware random feature kernels to address efficiency issues in pretrained transformer models whose query-key distributions are anisotropic.
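To illustrate how random features can bring attention from quadratic to linear cost, here is a minimal NumPy sketch. It is not DARKFormer's implementation: it uses generic positive random features for the exponential kernel, and the `cov` argument is a hypothetical stand-in for "data-aware" sampling, assumed here to mean drawing projection directions from a Gaussian whose covariance reflects the query-key geometry. With `cov` left at the identity, the sketch approximates ordinary softmax attention; with another covariance it approximates attention under the kernel exp(qᵀ Σ k / √d). All function and parameter names are illustrative.

```python
import numpy as np

def positive_random_features(x, w, cov):
    """Map rows of x (n, d) to m positive features using rows of w (m, d).

    With w_i ~ N(0, cov), phi(x)_i = exp(w_i.x - x^T cov x / 2) / sqrt(m),
    so E[phi(q) . phi(k)] = exp(q^T cov k).
    """
    m = w.shape[0]
    half_sq_norm = 0.5 * np.einsum("nd,de,ne->n", x, cov, x)
    return np.exp(x @ w.T - half_sq_norm[:, None]) / np.sqrt(m)

def linear_attention(q, k, v, num_features=256, cov=None, seed=0):
    """Approximate attention in O(n * m * d) time instead of O(n^2 * d).

    `cov` is a hypothetical hook for data-aware sampling (an assumption of
    this sketch); the identity recovers the isotropic case.
    """
    n, d = q.shape
    cov = np.eye(d) if cov is None else cov
    rng = np.random.default_rng(seed)
    w = rng.multivariate_normal(np.zeros(d), cov, size=num_features)

    # Split the usual 1/sqrt(d) scaling evenly between queries and keys.
    scale = d ** -0.25
    phi_q = positive_random_features(q * scale, w, cov)  # (n, m)
    phi_k = positive_random_features(k * scale, w, cov)  # (n, m)

    # Associativity trick: summarize keys and values once, and never
    # build the (n, n) score matrix.
    kv = phi_k.T @ v              # (m, d_v)
    normalizer = phi_k.sum(axis=0)  # (m,)
    return (phi_q @ kv) / (phi_q @ normalizer)[:, None]

def softmax_attention(q, k, v, cov=None):
    """Exact quadratic-cost reference for the same kernel."""
    d = q.shape[1]
    cov = np.eye(d) if cov is None else cov
    scores = (q @ cov @ k.T) / np.sqrt(d)
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    return (weights / weights.sum(axis=1, keepdims=True)) @ v
```

The approximation error shrinks as `num_features` grows. Random-feature estimators of this kind become noisier when queries and keys have large norm along some directions, which is one plausible reason to shape the sampling distribution to the data rather than keep it isotropic.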