The Granularity Axis: A Micro-to-Macro Latent Direction for Social Roles in Language Models
Researchers demonstrate that large language models encode social role granularity, from individual to institutional perspectives, as a structured geometric axis in their internal representations. Using activation steering, they show this axis is causally manipulable, enabling controlled shifts in response scope across different models.
This research reveals a fundamental structural property of how large language models organize social reasoning internally. Rather than treating role-taking as a surface-level stylistic choice, the findings show that granularity exists as a quantifiable latent direction within model representations. The Granularity Axis accounts for over 52% of the variance in role representation space and remains stable across layers, prompt variants, and model architectures, suggesting it reflects core model behavior rather than a prompt-specific artifact.
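As a concrete illustration, a dominant latent direction of this kind can be recovered with ordinary PCA over role-conditioned hidden states. The sketch below is a minimal reconstruction, not the authors' pipeline; the file name, pooling choice, and number of components are assumptions.

```python
# Minimal sketch: extracting a dominant "granularity" direction via PCA.
# Assumes one hidden-state vector per social-role prompt (e.g. mean-pooled
# residual-stream activations at a single layer); the input file below is
# hypothetical.
import numpy as np
from sklearn.decomposition import PCA

H = np.load("role_hidden_states.npy")           # (n_roles, d_model), assumed
H_centered = H - H.mean(axis=0, keepdims=True)  # center before PCA

pca = PCA(n_components=5)
pca.fit(H_centered)

granularity_axis = pca.components_[0]           # unit vector, shape (d_model,)
print(f"PC1 explains {pca.explained_variance_ratio_[0]:.1%} of variance")

# Project each role onto the axis: if the axis tracks granularity, micro
# roles (e.g. "a nurse") should cluster at one end and macro/institutional
# roles (e.g. "a hospital") at the other.
scores = H_centered @ granularity_axis
```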
The work builds on growing evidence that LLMs develop interpretable internal structure corresponding to semantic dimensions. Previous research identified similar directional organization for concepts like sentiment and factuality; this study extends that pattern to the scale of social reasoning, from individual to organizational and institutional levels. The consistency across Qwen and Llama models indicates the property emerges as a general feature of instruction-tuned LLMs.
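Because two models' hidden spaces have different dimensions and bases, their axes cannot be compared directly; one hedged way to check cross-model consistency is to fit an axis per model, project the same set of roles onto each, and correlate the resulting orderings. The sketch below assumes precomputed role states for each model; the file names are hypothetical.

```python
# Sketch of a cross-model consistency check: correlate how two models
# order the same roles along their own first principal component.
import numpy as np
from scipy.stats import spearmanr
from sklearn.decomposition import PCA

def axis_scores(H):
    """Project centered role states onto the model's dominant direction."""
    H = H - H.mean(axis=0, keepdims=True)
    axis = PCA(n_components=1).fit(H).components_[0]
    return H @ axis

H_qwen = np.load("qwen_role_states.npy")    # (n_roles, d_qwen), assumed
H_llama = np.load("llama_role_states.npy")  # (n_roles, d_llama), assumed

rho, _ = spearmanr(axis_scores(H_qwen), axis_scores(H_llama))
print(f"role-ordering agreement: |rho| = {abs(rho):.2f}")  # PCA sign is arbitrary
```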
For AI developers and safety researchers, the ability to causally steer granularity through activation steering opens up practical applications. Organizations could ensure models maintain an appropriate scope when role-playing as institutional entities, or conversely elicit detailed individual perspectives when needed. The observation that controllability differs between architectures suggests designers must understand each model's default operating regime to implement effective steering.
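In practice, activation steering of this kind is commonly implemented by adding a scaled direction vector to the residual stream during generation. The following is a minimal sketch using a PyTorch forward hook with Hugging Face transformers; the model name, layer index, steering coefficient, and sign convention are all assumptions that would need per-model tuning, consistent with the paper's point about differing controllability.

```python
# Hedged sketch: steer generation along a precomputed granularity axis by
# adding alpha * axis to one decoder layer's output hidden states.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-7B-Instruct"  # illustrative choice of model
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)

axis = torch.load("granularity_axis.pt")  # hypothetical (d_model,) unit vector
alpha = 8.0                               # + toward macro, - toward micro (assumed sign)

def steer(module, inputs, output):
    # Decoder layers return a tuple; hidden states are element 0.
    hidden = output[0] + alpha * axis.to(output[0].dtype).to(output[0].device)
    return (hidden,) + output[1:]

layer = model.model.layers[16]            # mid-depth layer, illustrative
handle = layer.register_forward_hook(steer)
try:
    prompt = "As a hospital, how should we respond to a patient complaint?"
    ids = tok(prompt, return_tensors="pt").input_ids
    out = model.generate(ids, max_new_tokens=120)
    print(tok.decode(out[0], skip_special_tokens=True))
finally:
    handle.remove()                       # always detach the hook afterward
```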
Future work should explore whether other semantic dimensions show similarly structured organization and whether steering generalizes across diverse model families. Understanding these latent directions could improve alignment efforts by enabling more precise control over model behavior without retraining. The transferability of findings across models hints at potentially universal principles underlying how language models internally represent conceptual hierarchies.
- Granularity of social roles is encoded as a dominant geometric axis in LLM hidden states, explaining over 52% of role representation variance.
- The axis remains stable across layers and prompt variants, and transfers between model architectures including Qwen and Llama.
- Activation steering along the granularity axis causally shifts model responses from micro to macro perspectives, demonstrating causal relevance beyond correlation.
- Different models show varying controllability despite similar underlying structure, suggesting steering effectiveness depends on each model's training regime.
- This structured latent direction indicates social role granularity is not a stylistic surface feature but a core organizational principle in LLM cognition.