arXiv – CS AI · 6h ago
🧠
Parity, Sensitivity, and Transformers
Researchers have resolved a long-standing theoretical question about transformer neural networks by proving that at least two layers are required to compute the PARITY task (deciding whether a binary sequence contains an even or odd number of 1s). The paper also presents a more practical four-layer transformer construction that works with standard softmax attention and realistic positional encodings, dispensing with the impractical assumptions of earlier constructions.
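The PARITY task itself is easy to state. A minimal Python sketch of what the function computes (illustrative only; this is not the paper's transformer construction):

```python
def parity(bits):
    """Return 1 if the sequence contains an odd number of 1s, else 0."""
    return sum(bits) % 2

# Example: [1, 0, 1, 1] has three 1s, so parity is odd.
print(parity([1, 0, 1, 1]))  # → 1
print(parity([1, 0, 1, 0]))  # → 0
```

Although trivial to compute directly, PARITY is a classic hard case for shallow architectures because flipping any single input bit flips the output, giving the function maximal sensitivity.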