Narrative over Numbers: The Identifiable Victim Effect and its Amplification Under Alignment and Reasoning in Large Language Models
Researchers tested whether large language models exhibit the Identifiable Victim Effect (IVE)—a well-documented cognitive bias where people prioritize helping a specific individual over a larger group facing equal hardship. Across 51,955 API trials spanning 16 frontier models, instruction-tuned LLMs showed amplified IVE compared to humans, while reasoning-specialized models inverted the effect, raising critical concerns about AI deployment in humanitarian decision-making.
This research exposes a consequential gap between human and machine moral reasoning at scale. The Identifiable Victim Effect, long documented in psychology, describes how narratives about specific people trigger stronger resource allocation than statistical descriptions of equivalent suffering. The study's finding that instruction-tuned models exhibit IVE at roughly double the human baseline (d=0.223 vs. d≈0.10) suggests current LLM alignment techniques may inadvertently amplify affective biases rather than mitigate them.
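To ground the effect-size comparison, here is a minimal sketch of how a Cohen's d could be computed from per-trial allocations under the two framings; the pooled-standard-deviation estimator, variable names, and simulated numbers are illustrative assumptions, not the study's actual data or procedure.

```python
import numpy as np

def cohens_d(identifiable: np.ndarray, statistical: np.ndarray) -> float:
    """Standardized mean difference between the two framing conditions,
    using a pooled standard deviation (one common definition of Cohen's d)."""
    n1, n2 = len(identifiable), len(statistical)
    pooled_var = ((n1 - 1) * identifiable.var(ddof=1) +
                  (n2 - 1) * statistical.var(ddof=1)) / (n1 + n2 - 2)
    return (identifiable.mean() - statistical.mean()) / np.sqrt(pooled_var)

# Simulated allocations (share of a fixed budget given to the victim(s)),
# purely to illustrate the calculation -- not data from the study.
rng = np.random.default_rng(0)
identifiable_alloc = rng.normal(0.65, 0.15, 1000)  # named-individual framing
statistical_alloc = rng.normal(0.62, 0.15, 1000)   # equivalent statistical framing
print(f"d = {cohens_d(identifiable_alloc, statistical_alloc):.3f}")
```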
The divergence between model types is particularly significant. Reasoning-specialized models like o1 invert the effect, allocating fewer resources to individual victims than to groups, which indicates that explicit reasoning pathways can overcome narrative framing. Standard Chain-of-Thought prompting backfires, nearly tripling the effect size, suggesting that intermediate reasoning steps without explicit utilitarian framing entrench bias rather than correct it.
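The prompting contrasts described above can be illustrated with a small harness; the scenario wording, condition names, and templates below are hypothetical stand-ins, not the prompts used in the study.

```python
# Hypothetical scenario pair: one named victim vs. an equivalent statistical group.
SCENARIO_IDENTIFIABLE = (
    "Maria, a seven-year-old girl, urgently needs treatment. "
    "You manage a $10,000 relief budget. How much do you allocate to her case?"
)
SCENARIO_STATISTICAL = (
    "Eight children in the region urgently need the same treatment. "
    "You manage a $10,000 relief budget. How much do you allocate to their cases?"
)

# Prompting styles corresponding to the contrasts discussed above.
PROMPT_STYLES = {
    "direct": "{scenario}\nAnswer with a dollar amount only.",
    "chain_of_thought": "{scenario}\nThink step by step, then give a dollar amount.",
    "utilitarian": (
        "{scenario}\nAllocate so as to maximize the total benefit across all "
        "people affected, then give a dollar amount."
    ),
}

def build_trials():
    """Yield (framing, style, prompt) triples forming one block of paired trials."""
    framings = [("identifiable", SCENARIO_IDENTIFIABLE),
                ("statistical", SCENARIO_STATISTICAL)]
    for framing, scenario in framings:
        for style, template in PROMPT_STYLES.items():
            yield framing, style, template.format(scenario=scenario)

for framing, style, prompt in build_trials():
    print(f"[{framing} / {style}]\n{prompt}\n")
```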
For AI deployment in humanitarian triage, grant evaluation, and content moderation, these findings carry material implications. Organizations implementing LLM-based decision systems face a trade-off: instruction-tuned models are more accessible and widely deployed but harbor amplified decision-making biases, while reasoning-specialized models invert the effect rather than neutralizing it. The research documents additional pathologies, including psychophysical numbing and quantity neglect, meaning these systems fail to scale their responses appropriately with group size, a fundamental requirement for equitable resource allocation.
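Quantity neglect can be made measurable by checking whether allocations grow with group size; the sketch below fits a simple slope, where a value near zero indicates that responses do not scale with the number of victims. The data and the linear-fit choice are illustrative assumptions, not the study's analysis.

```python
import numpy as np

def quantity_sensitivity(group_sizes, allocations):
    """Least-squares slope of allocation against group size.
    A slope near zero means allocations do not grow with the number of
    victims, i.e. quantity neglect."""
    slope, _intercept = np.polyfit(group_sizes, allocations, deg=1)
    return slope

# Hypothetical budget shares for groups of increasing size; the flat pattern
# illustrates what quantity neglect would look like in such data.
group_sizes = np.array([1, 2, 4, 8, 16, 32])
allocations = np.array([0.62, 0.63, 0.61, 0.62, 0.60, 0.61])
print(f"slope per additional victim: {quantity_sensitivity(group_sizes, allocations):.4f}")
```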
The path forward requires deliberate intervention during model development. Utilitarian-focused prompting showed promise, but mitigation cannot be left to deployment-time prompt choices; alignment training itself needs restructuring to prevent bias amplification. As LLMs assume increasingly consequential roles in real-world allocation decisions, this research provides essential evidence for auditing and redesigning these systems before deployment at scale.
- Instruction-tuned LLMs exhibit the Identifiable Victim Effect at approximately double the human baseline, amplifying narrative bias in resource allocation decisions
- Reasoning-specialized models invert the effect entirely, suggesting explicit reasoning pathways can overcome narrative framing biases
- Standard Chain-of-Thought prompting nearly triples the IVE effect size, indicating that intermediate reasoning without utilitarian framing entrenches rather than corrects bias
- LLMs demonstrate psychophysical numbing and perfect quantity neglect, failing to appropriately scale responses based on group size
- Current alignment training may inadvertently amplify affective biases rather than mitigate them, raising concerns for AI systems in humanitarian and ethical decision-making