AI · Neutral · arXiv – CS AI · 7h ago · 6/10
Emergent Hierarchical Structure in Large Language Models: An Information-Theoretic Framework for Multi-Scale Representation
Researchers report that large language models develop distinct hierarchical processing stages (Local, Intermediate, Global) that are determined by architecture family rather than model size. Using information-theoretic measures, they show that Llama and Qwen models exhibit markedly different brittleness patterns across layers, and they conclude that architectural design, not scaling, is the primary driver of model behavior.