Emergent Causal-Geometric Dynamics Across Depth in Large Language Models
Researchers have combined geometric and causal analyses to explain how decoder-only large language models transform context into predictions across layers. They identify a sharp computational transition partway through the network's depth and find that angular structure in late layers governs token prediction, while representation norms are largely decoupled from it.
This research advances mechanistic interpretability of large language models by bridging two previously separate analytical frameworks: geometric analysis of representations and causal intervention studies. The findings indicate that LLMs undergo a distinct computational phase transition partway through their depth, where processing shifts from context handling to prediction formation, while the geometry of the representations reorganizes more gradually. These layer-wise dynamics matter because they challenge the assumption that individual layers function independently: prediction control instead emerges from the network's overall dynamical structure.
The identification of late-layer geometric codes, in which angular relationships between representations correspond to token probability distributions, gives a concrete mechanistic handle that could improve model interpretability and control. This is relevant to AI safety and alignment researchers who need to intervene on model behavior reliably. For the interpretability community, the work offers a unified framework that explains previously conflicting findings from purely geometric or purely causal approaches.
For practitioners developing AI systems, the results suggest that effective layer-wise interventions require understanding the broader network context rather than treating layers as isolated components. The decoupling between representation norms and prediction formation also indicates that different aspects of learned representations serve distinct functional purposes, which could inform future approaches to model compression, pruning, and fine-tuning.
- LLMs exhibit a sharp computational transition from context-processing to prediction-forming across depth, accompanied by a more gradual geometric reorganization
- Angular structure in late-layer representations directly parameterizes next-token probability distributions and enables selective causal control
- Representation norms encode information largely decoupled from token prediction, suggesting redundant or alternative functional roles
- Layer-wise function cannot be understood or effectively modified in isolation from the network's emergent global dynamical structure
- Synthesizing geometric and causal perspectives provides mechanistic explanations that reconcile previously contradictory interpretability findings
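The contrast between angular structure and norms can be illustrated with a toy NumPy sketch. It is not the researchers' method or data. It assumes a final RMS normalization before the unembedding matrix, as many decoder-only LLMs use, and every name in it (`rms_norm`, `next_token_probs`, `W_U`, the dimensions) is hypothetical. Under that assumption, rescaling a final hidden state leaves the next-token distribution unchanged, while rotating it at fixed norm changes the distribution. The invariance is exact here by construction, whereas the source reports only that norms are largely decoupled from prediction.

```python
import numpy as np

def rms_norm(h, eps=1e-8):
    # Divide by the root-mean-square, discarding the vector's overall scale.
    return h / np.sqrt(np.mean(h ** 2) + eps)

def softmax(z):
    z = z - z.max()  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def next_token_probs(h, W_U):
    # Toy readout (assumed): normalize, then project onto the vocabulary.
    return softmax(W_U @ rms_norm(h))

def total_variation(p, q):
    return 0.5 * np.abs(p - q).sum()

rng = np.random.default_rng(0)
d_model, vocab = 16, 10                   # hypothetical toy sizes
W_U = rng.normal(size=(vocab, d_model))   # random stand-in for an unembedding
h = rng.normal(size=d_model)              # stand-in for a late-layer state

# Norm intervention: same direction, three times the length.
p_base = next_token_probs(h, W_U)
p_scaled = next_token_probs(3.0 * h, W_U)

# Angular intervention: rotate h by 45 degrees toward an orthogonal
# direction of equal length, so the norm stays the same.
other = rng.normal(size=d_model)
other -= (other @ h) / (h @ h) * h        # remove the component along h
other *= np.linalg.norm(h) / np.linalg.norm(other)
theta = np.pi / 4
h_rot = np.cos(theta) * h + np.sin(theta) * other
p_rot = next_token_probs(h_rot, W_U)

tv_scaled = total_variation(p_base, p_scaled)
tv_rot = total_variation(p_base, p_rot)
print(f"TV distance after rescaling: {tv_scaled:.2e}")
print(f"TV distance after rotation:  {tv_rot:.2e}")
```

In this sketch the rescaled state yields essentially the same distribution as the original, and the rotated state yields a different one. A real model could still use norms elsewhere, for example inside intermediate layers, which is consistent with the takeaway that norms may serve alternative functional roles.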