AI · Neutral · arXiv – CS AI · 6h ago · 6/10
Revisiting Padded Transformer Expressivity: Which Architectural Choices Matter and Which Don't
Researchers show that the computational expressivity of padded transformers stays the same across a range of architectural choices, with numeric precision and model depth emerging as the primary factors that determine capability. The findings establish formal equivalences between transformer models and circuit complexity classes, suggesting that practical transformer designs are more robust to architectural variation than previously understood.