A Shared Valence Axis Across Modern LLMs and Human EEG: The Saturation Regularity
Researchers report that large language models (LLMs) and human brain activity share a common valence (emotional) axis, with LLMs trained on emotion-evocative sentences producing representations that align with EEG patterns across 123 subjects. However, directly supervising neural networks to match this axis paradoxically degrades performance. The authors call this the 'saturation regularity' and argue that better brain decoding comes from ensemble methods that exploit residual diversity rather than from additional constraint-based training.
This research bridges neuroscience and artificial intelligence by revealing unexpected structural alignment between LLM representations and human neural activity. The study identifies a one-dimensional valence direction from language models that spontaneously emerges in both EEG classifiers and sentiment benchmarks, suggesting emotion processing follows similar organizational principles across synthetic and biological systems. This convergence offers valuable insights into how both systems encode affective information, potentially advancing our understanding of neural computation.
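To make the idea of a one-dimensional valence direction concrete, here is a minimal sketch of one common way to extract such an axis: the normalized difference between the mean embeddings of positive and negative sentences. This is an assumption for illustration, not necessarily the study's procedure; the function names and the toy 3-dimensional vectors are hypothetical and stand in for real LLM embeddings.

```python
import math

def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def valence_direction(pos_vecs, neg_vecs):
    """Unit vector pointing from the negative-class mean to the positive-class mean."""
    mu_pos = mean_vector(pos_vecs)
    mu_neg = mean_vector(neg_vecs)
    diff = [p - q for p, q in zip(mu_pos, mu_neg)]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm == 0.0:
        raise ValueError("class means coincide; no direction can be defined")
    return [d / norm for d in diff]

def project(vec, direction):
    """Scalar valence score: dot product of a vector with the axis."""
    return sum(v * d for v, d in zip(vec, direction))

# Toy embeddings (made up for illustration, not real model outputs).
positive = [[1.0, 0.2, 0.0], [0.8, -0.1, 0.1]]
negative = [[-0.9, 0.1, 0.0], [-1.1, 0.0, 0.1]]

axis = valence_direction(positive, negative)
scores = [project(v, axis) for v in positive + negative]
```

In this toy setup, positive examples project to positive scores and negative examples to negative ones. Comparing LLM and EEG representations would then amount to checking how well two such axes, estimated independently, agree.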
The counterintuitive finding, that additional supervised alignment actually harms decoding performance, is a significant methodological contribution. Under the 'saturation regularity', task labels alone already saturate the relevant feature space, so further supervision pushes the model out of an already-optimized basin instead of improving it. This challenges the common deep learning assumption that more training signal improves outcomes, with implications extending beyond neuroscience to general machine learning practice.
For the AI research community, this work suggests that LLMs can serve as effective proxies for understanding human cognition, potentially reducing dependence on expensive neuroimaging studies for validating neural alignment hypotheses. The reported 10.5% improvement in balanced accuracy from residual ensemble methods is consistent with the theoretical account. The results strengthen the argument that contemporary LLMs capture meaningful aspects of human cognitive structure, not merely statistical patterns in text.
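The sketch below shows how balanced accuracy is computed and why combining models with different errors can raise it. It uses plain probability averaging over two hypothetical classifiers, which is a generic ensemble and not the study's residual ensemble method; all numbers are invented for the example.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally."""
    classes = sorted(set(y_true))
    recalls = []
    for c in classes:
        idx = [i for i, y in enumerate(y_true) if y == c]
        correct = sum(1 for i in idx if y_pred[i] == c)
        recalls.append(correct / len(idx))
    return sum(recalls) / len(recalls)

def threshold(probs, cutoff=0.5):
    """Turn positive-class probabilities into 0/1 predictions."""
    return [1 if p >= cutoff else 0 for p in probs]

def average_ensemble(prob_lists, cutoff=0.5):
    """Average positive-class probabilities across models, then threshold."""
    n_models = len(prob_lists)
    mean_probs = [sum(p) / n_models for p in zip(*prob_lists)]
    return threshold(mean_probs, cutoff)

# Invented labels and model outputs (assumptions for illustration only).
y_true = [1, 1, 1, 0, 0, 0]
model_a = [0.9, 0.4, 0.8, 0.3, 0.6, 0.2]  # wrong on trials 2 and 5
model_b = [0.7, 0.8, 0.3, 0.1, 0.3, 0.6]  # wrong on trials 3 and 6

acc_a = balanced_accuracy(y_true, threshold(model_a))
acc_b = balanced_accuracy(y_true, threshold(model_b))
acc_ens = balanced_accuracy(y_true, average_ensemble([model_a, model_b]))
```

Here each model alone reaches a balanced accuracy of about 0.67, while the averaged ensemble classifies every trial correctly because the two models err on different trials. The gain depends entirely on that diversity; two models making the same mistakes would not improve when averaged.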
Future work should explore whether the saturation regularity applies to brain-decoding domains beyond emotion recognition, and whether similar principles explain why scaling language models sometimes yields diminishing returns despite increased supervision.
- LLMs and human EEG share a common emotional valence axis, suggesting alignment between artificial and biological information processing
- Direct supervision to enforce alignment paradoxically reduces decoding performance, revealing a 'saturation regularity' in neural network training
- Task labels alone saturate the optimal feature space, making additional supervised constraints counterproductive
- Ensemble methods leveraging residual diversity improve accuracy by 10.5%, validating the saturation regularity principle
- The findings suggest LLMs serve as valuable proxies for understanding human neural computation without direct brain data