AI · Neutral · arXiv – CS AI · 6h ago · 7/10
Positional versus Symbolic Attention Heads: Learning Dynamics, RoPE Geometry, and Length Generalization
Researchers analyzing transformer language models found that attention heads naturally specialize during training into either positional (location-based) or symbolic (meaning-based) mechanisms. The study reports that symbolic mechanisms generalize better to longer sequences than positional ones, and it grounds the theoretical explanation in RoPE geometry.
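
For readers unfamiliar with RoPE (rotary position embeddings), the geometric property the summary alludes to can be illustrated with a minimal sketch. This is not the paper's code: the function names `rope_rotate` and `attention_logit`, the toy vectors, and the base of 10000 are illustrative assumptions. It shows only the well-known fact that, under RoPE, the query-key dot product depends on the relative offset between two positions rather than on their absolute values.

```python
import math


def rope_rotate(vec, position, base=10000.0):
    """Rotate each consecutive pair of coordinates by a position-dependent angle.

    Hypothetical helper for illustration; assumes len(vec) is even.
    """
    dim = len(vec)
    out = []
    for i in range(0, dim, 2):
        # Pair i/2 uses frequency base^(-i/dim), the usual RoPE schedule.
        theta = position * base ** (-i / dim)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out


def attention_logit(query, key, q_pos, k_pos):
    """Unnormalized attention score between a rotated query and a rotated key."""
    q = rope_rotate(query, q_pos)
    k = rope_rotate(key, k_pos)
    return sum(a * b for a, b in zip(q, k))


# Toy vectors (assumed values, not taken from the paper).
query = [0.3, -1.2, 0.7, 0.5]
key = [1.0, 0.4, -0.6, 0.9]

near = attention_logit(query, key, q_pos=5, k_pos=3)       # offset 2, early in the sequence
far = attention_logit(query, key, q_pos=105, k_pos=103)    # offset 2, much later
other = attention_logit(query, key, q_pos=5, k_pos=4)      # offset 1

print(abs(near - far) < 1e-9)   # same offset gives the same score up to rounding
print(abs(near - other) > 1e-3)  # a different offset gives a different score
```

Because the score is a function of the offset alone, a head can in principle key on "how far back" a token is independently of where the pair sits in the sequence. How this interacts with length generalization is the subject of the paper itself and is not reproduced here.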