AINeutralarXiv – CS AI · 10h ago7/10
🧠
First-Token Broadcasters: Mechanistic Origins of Language Identity and Distributed Robustness in Transformers
Researchers identify specific attention heads in multilingual language models responsible for language switching errors, revealing that instruction tuning reorganizes these circuits to concentrate language identity signals in early layers. The study demonstrates that language selection operates through a distributed but hierarchical mechanism, with compensation patterns following predictable feedforward cascades rather than global diffusion.