#length-generalization News & Analysis

2 articles tagged with #length-generalization. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Neutral · arXiv – CS AI · Jun 1 · 7/10

Positional versus Symbolic Attention Heads: Learning Dynamics, RoPE Geometry, and Length Generalization

Researchers analyzing Transformer language models found that attention heads specialize during training into either positional (location-based) or symbolic (content-based) mechanisms. The study reports that symbolic mechanisms generalize better to longer sequences than positional ones, and it grounds this in a theoretical account of the geometry of RoPE (rotary position embeddings).

AI · Bullish · arXiv – CS AI · Jun 8 · 6/10

Discovering Interpretable Algorithms by Decompiling Transformers to RASP

Researchers present a method for extracting interpretable programs from trained Transformers: they convert a model to RASP (Restricted Access Sequence Processing, a simple programming language that models Transformer computation) and use causal interventions to identify minimal sub-programs. Experiments on algorithmic tasks show that length-generalizing Transformers often implement simple, understandable algorithms internally, which is evidence that these models can learn human-readable solutions.