Mutation Without Variation: Convergence Dynamics in LLM-Driven Program Evolution
Researchers demonstrate that large language models (LLMs) exhibit systematic convergence bias when mutating programs, revisiting similar structural forms in 87% of cases despite stochastic variation. This reveals a fundamental tension in LLM-driven program evolution: while these models excel at semantics-aware transformations, they steer exploration toward restricted regions of program space, limiting their effectiveness for open-ended evolutionary search.
This research exposes a critical limitation in applying LLMs to automated program synthesis and genetic programming tasks. The study systematically analyzes mutation chains across varying prompts, model families, and stochastic conditions, revealing that LLMs consistently converge toward attractor regions rather than exploring diverse program structures. The 87% revisitation rate of structural forms indicates the models develop implicit biases toward particular syntactic patterns, constraining the search space in ways that classical genetic programming operators do not exhibit.
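To make the idea of "revisiting a structural form" concrete, here is a minimal sketch of how such a rate could be measured for Python programs. It is an illustration under stated assumptions, not the study's actual metric: the names `structural_fingerprint` and `revisit_rate` are hypothetical, and "structural form" is approximated as the shape of the syntax tree with identifiers and literal values ignored.

```python
import ast


def structural_fingerprint(source: str) -> str:
    """Serialize only the AST node types, ignoring names and literal values.

    Assumption: two programs share a "structural form" when their syntax
    trees have the same shape. The study may define this differently.
    """
    def shape(node: ast.AST) -> str:
        children = ",".join(shape(child) for child in ast.iter_child_nodes(node))
        return f"{type(node).__name__}({children})"

    return shape(ast.parse(source))


def revisit_rate(chain: list[str]) -> float:
    """Fraction of mutation steps that land on a shape already seen in the chain.

    chain[0] is the seed program; every later entry is one mutation step.
    """
    if len(chain) < 2:
        return 0.0
    seen = {structural_fingerprint(chain[0])}
    revisits = 0
    for program in chain[1:]:
        fingerprint = structural_fingerprint(program)
        if fingerprint in seen:
            revisits += 1
        seen.add(fingerprint)
    return revisits / (len(chain) - 1)


if __name__ == "__main__":
    # Toy chain: steps 1 and 3 reuse the seed's shape, step 2 is new.
    chain = ["x = a + b", "y = c + d", "z = a * b", "q = m + n"]
    print(f"revisit rate: {revisit_rate(chain):.2f}")
```

Under this definition, a chain that only renames variables scores close to 1.0, while a chain that keeps introducing new control-flow or expression shapes scores close to 0.0.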
The convergence phenomenon stems from LLMs' underlying architecture and training dynamics. These models optimize for semantic coherence and grammatical validity, which inadvertently pushes mutations toward familiar, high-probability token sequences. While this capability enables semantically aware transformations that random mutation cannot achieve, it also produces structural homogeneity that undermines the diversity evolutionary search depends on.
For developers and researchers working on LLM-based code generation and program synthesis, this finding demands architectural reconsideration. Systems relying on LLM mutations for exploration will struggle to escape local optima or discover novel program structures. The variance across prompt wording and model choice suggests mitigation strategies exist: custom prompting, ensemble approaches, or hybrid methods combining LLM transformation with classical mutation operators.
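One of the mitigation strategies mentioned above, a hybrid of LLM transformation and classical mutation, might be sketched as follows. This is a toy illustration rather than a method from the study: `hybrid_mutate`, `classical_mutate`, and the `p_classical` parameter are hypothetical names, the operator swap stands in for a full set of genetic programming operators, and `llm_mutate` is assumed to be any caller-supplied function that sends source code to a model and returns the mutated source.

```python
import ast
import random
from typing import Callable

# Candidate replacement operators for the toy classical mutation.
_OPS = [ast.Add, ast.Sub, ast.Mult]


class _SwapBinOp(ast.NodeTransformer):
    """Replace every binary operator with a different, randomly chosen one."""

    def __init__(self, rng: random.Random) -> None:
        self.rng = rng

    def visit_BinOp(self, node: ast.BinOp) -> ast.BinOp:
        self.generic_visit(node)  # mutate nested expressions first
        choices = [op for op in _OPS if not isinstance(node.op, op)]
        node.op = self.rng.choice(choices)()
        return node


def classical_mutate(source: str, rng: random.Random) -> str:
    """Syntax-level mutation that has no bias toward 'likely' code."""
    tree = _SwapBinOp(rng).visit(ast.parse(source))
    return ast.unparse(ast.fix_missing_locations(tree))


def hybrid_mutate(
    source: str,
    llm_mutate: Callable[[str], str],  # hypothetical: wraps a model call
    rng: random.Random,
    p_classical: float = 0.3,  # assumed mixing rate, not from the study
) -> str:
    """With probability p_classical use the classical operator, else the LLM."""
    if rng.random() < p_classical:
        return classical_mutate(source, rng)
    return llm_mutate(source)


if __name__ == "__main__":
    rng = random.Random(0)
    identity_llm = lambda src: src  # stand-in for a real model call
    print(hybrid_mutate("x = a + b", identity_llm, rng, p_classical=1.0))
```

The mixing rate is the main design lever in this sketch: a higher `p_classical` injects more structurally unbiased variation at the cost of more semantically meaningless mutants, which mirrors the trade-off the article describes.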
Looking forward, practitioners must either accept limited exploration scope or engineer countermeasures explicitly designed to combat convergence bias. Future work should investigate whether fine-tuning, prompt engineering, or architectural modifications can weaken the pull of these attractor regions without sacrificing semantic awareness. This research highlights how seemingly desirable model properties can create hidden constraints in specialized applications.
- LLMs exhibit systematic convergence bias, revisiting similar structural program forms in 87% of cases despite stochastic variation.
- The semantic awareness that enables valid program transformations also enforces structural homogeneity, creating an inherent trade-off in LLM-driven evolution.
- Classical genetic programming operators do not exhibit comparable convergence, indicating the bias is intrinsic to LLM mutation pipelines rather than universal to mutation-based search.
- Convergence severity varies with prompt design and model selection, suggesting mitigation strategies exist but require explicit engineering.
- Open-ended program exploration with LLMs requires hybrid approaches that combine LLM transformations with diversity-promoting mechanisms, going beyond current LLM-only designs.