When Is Emergent Consensus Real? A Measured Coupling Gain and a Validity Diagnostic for LLM Agent Societies
Researchers introduce "coupling gain," a measured quantity for testing whether consensus or polarization in LLM agent societies reflects genuine social dynamics or model artifacts. The study finds that frontier LLMs do not spontaneously polarize, and that claims of emergent consensus must be validated against initial conditions and context-specific coupling measurements rather than assumed theoretical models.
This research addresses a critical gap in AI agent research: the lack of empirical rigor in studies of emergent behavior in multi-agent LLM systems. Previous demonstrations of consensus or polarization in agent societies had no measurable control parameter and no diagnostic test to distinguish real social dynamics from artifacts of model priors and architecture. Coupling gain, measured through counterfactual perturbation of neighbor opinions, supplies that parameter. It is stable across five frontier models and invariant to paraphrasing, which supports reading it as a measure of how agents respond to evidence rather than as a superficial social effect.
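The counterfactual-perturbation idea can be sketched in a few lines. This is only an illustration under stated assumptions: the source does not give the estimator, so the sketch assumes coupling gain is the least-squares slope (through the origin) of the agent's response shift against the shift applied to a neighbor's opinion. The names `estimate_coupling_gain` and `toy_agent` are hypothetical, and `toy_agent` is a linear stand-in for what would in practice be an LLM call.

```python
from typing import Callable, Sequence


def estimate_coupling_gain(
    respond: Callable[[float], float],
    baseline: float,
    deltas: Sequence[float],
) -> float:
    """Slope of (response shift) vs. (neighbor-opinion shift), fit through the origin.

    ASSUMPTION: this estimator is illustrative; the source does not specify one.
    """
    base_response = respond(baseline)
    num = 0.0
    den = 0.0
    for d in deltas:
        # Counterfactual: same agent, same context, neighbor opinion moved by d.
        shift = respond(baseline + d) - base_response
        num += d * shift
        den += d * d
    if den == 0.0:
        raise ValueError("need at least one nonzero perturbation")
    return num / den


def toy_agent(neighbor_opinion: float, own_opinion: float = 0.2, gain: float = 0.3) -> float:
    """Hypothetical stand-in for an LLM agent: moves a fixed fraction toward the neighbor."""
    return own_opinion + gain * (neighbor_opinion - own_opinion)


gain_estimate = estimate_coupling_gain(toy_agent, baseline=0.5, deltas=[-0.4, -0.2, 0.2, 0.4])
print(f"estimated coupling gain: {gain_estimate:.3f}")
```

Because the stand-in agent is exactly linear, the estimate recovers the gain it was built with. A real agent would add noise, so repeated queries and confidence intervals would be needed.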
The findings have substantial implications for AI research credibility. Classical dynamical systems models (Friedkin-Johnsen, signed-Laplacian) successfully organize the observed regimes when their coefficients are measured empirically rather than assumed theoretically. Critically, the study finds that frontier LLMs lack spontaneous polarization dynamics (beta ≤ 0), contradicting narratives about AI systems self-reinforcing extreme positions. When polarization appears, it is always induced externally, never emergent. A randomized-initial-condition diagnostic separates genuine averaging behavior from model-prior artifacts. Applied to published consensus research, it exposes a conflation of authentic consensus on debatable claims with model-specific priors on settled facts.
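The logic of a randomized-initial-condition test can be illustrated with a toy simulation. This is a sketch, not the study's procedure: it uses a simple linear averaging update on a complete graph with an optional pull toward a fixed prior (a simplification, not the Friedkin-Johnsen model itself), and every name and parameter value is hypothetical. The idea is that if agents genuinely average, the final group opinion tracks the randomized starting mean (slope near 1), whereas if a shared prior drives the outcome, the final opinion ignores the start (slope near 0).

```python
import random
from statistics import mean


def simulate(initial, gain, prior=0.0, prior_pull=0.0, steps=50):
    """Synchronous updates: each agent moves toward the group mean and,
    optionally, toward a fixed prior (toy model, assumed for illustration)."""
    x = list(initial)
    for _ in range(steps):
        m = mean(x)
        x = [xi + gain * (m - xi) + prior_pull * (prior - xi) for xi in x]
    return x


def initial_condition_slope(gain, prior_pull, prior=0.8, n_agents=6, n_trials=200, seed=0):
    """Regress the final group mean on the randomized initial group mean."""
    rng = random.Random(seed)
    starts, ends = [], []
    for _ in range(n_trials):
        x0 = [rng.uniform(-1.0, 1.0) for _ in range(n_agents)]
        starts.append(mean(x0))
        ends.append(mean(simulate(x0, gain, prior, prior_pull)))
    s_bar, e_bar = mean(starts), mean(ends)
    cov = sum((s - s_bar) * (e - e_bar) for s, e in zip(starts, ends))
    var = sum((s - s_bar) ** 2 for s in starts)
    return cov / var


averaging_slope = initial_condition_slope(gain=0.3, prior_pull=0.0)  # pure averaging
prior_slope = initial_condition_slope(gain=0.3, prior_pull=0.2)      # prior-driven
print(f"averaging society slope: {averaging_slope:.4f}")
print(f"prior-driven society slope: {prior_slope:.4f}")
```

Both toy societies end in agreement, so agreement alone cannot tell them apart. Only the dependence on initial conditions does, which is the distinction the diagnostic is meant to draw.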
Context dependence emerges as a key limitation: pairwise coupling measurements fail to predict multi-agent outcomes and can even reverse the ordering of outcomes. Only modality-matched group coupling demonstrates predictive power across sixteen models. The result cautions against generalizing agent dynamics from simple pairwise interactions to complex multi-agent scenarios. For AI development, the work establishes measurable standards for validating emergent social behavior, reducing false claims about AI consensus capabilities while enabling more rigorous comparative analysis across models.
- Coupling gain provides a stable, model-distinguishing measurement of agent influence across frontier LLMs, spanning 0.15 to 0.43 with narrow confidence intervals.
- Frontier LLMs do not spontaneously polarize; polarization only emerges from external induction, contradicting narratives about self-reinforcing extreme AI behavior.
- A randomized-initial-condition diagnostic separates genuine emergent consensus from model-prior artifacts, revealing conflation in previously published results.
- Pairwise coupling measurements fail to predict multi-agent outcomes and can reverse outcome ordering, requiring modality-matched group coupling instead.
- Classical dynamical systems models (Friedkin-Johnsen, signed-Laplacian) successfully organize agent behavior regimes when coefficients are empirically measured rather than theoretically assumed.