y0news
🧠 AI · 🔴 Bearish · Importance: 7/10

Fragile Preferences: A Deep Dive Into Order Effects in Large Language Models

arXiv – CS AI | Haonan Yin, Shai Vardi, Vidyanand Choudhary

🤖 AI Summary

Researchers conducted the first systematic study of order bias in large language models used for high-stakes decision-making, finding that LLMs exhibit strong position effects and previously undocumented name biases that can lead them to select strictly inferior options. The study identifies distinct failure modes in AI decision-support systems and proposes mitigation strategies, including temperature-parameter adjustments, to recover the models' underlying preferences.

Analysis

This research exposes a critical vulnerability in deploying LLMs for consequential decisions such as hiring and admissions. The study demonstrates that model preferences are fragile and context-dependent rather than stable, with order effects that vary with option quality, a counterintuitive finding suggesting the models lack robust decision-making frameworks. The discovery of name bias independent of demographic signals raises additional concerns about fairness in AI-driven selection processes.

The systematic nature of these biases reflects broader challenges in AI alignment and interpretability. Prior work noted position effects anecdotally, but this comprehensive analysis across multiple models and domains establishes that order bias is structural rather than superficial. The framework distinguishing between robust, fragile, and indifferent preferences provides a useful tool for understanding when AI recommendations can be trusted versus when they represent distorted judgments.
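The robust/fragile/indifferent distinction lends itself to a simple probing procedure: present the same option set under every ordering and see whether the winner survives. The sketch below is an illustration of that idea, not the paper's protocol; `choose` is a hypothetical stand-in for an LLM call that takes an ordered option list and returns a pick, and the labeling rules are assumptions.

```python
from itertools import permutations

def classify_preference(choose, options, trials=3):
    """Probe a model's preference over `options` for order sensitivity.

    Assumed labeling (a sketch, not the paper's exact criteria):
      - 'indifferent': repeated queries of the SAME ordering disagree,
        i.e. the choice is unstable even with position held fixed;
      - 'robust': every ordering yields the same single winner;
      - 'fragile': each ordering is self-consistent, but different
        orderings crown different winners (a pure position effect).
    """
    winners = set()
    for order in permutations(options):
        # Repeat the identical prompt to separate noise from order bias.
        picks = {choose(list(order)) for _ in range(trials)}
        if len(picks) > 1:
            return "indifferent"
        winners |= picks
    return "robust" if len(winners) == 1 else "fragile"
```

For example, a model that always returns the first option presented would be labeled `fragile`, since each ordering is internally consistent yet the "winner" is determined entirely by position.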

For organizations deploying LLMs in hiring and admissions, these findings warrant immediate review of decision-support workflows. Because models can select objectively inferior options when order effects interfere, such systems create liability risks as well as fairness concerns. The proposed mitigation strategies, particularly the novel temperature-based approaches, offer practical remedies, though they require validation across diverse contexts.

The research highlights that LLM failure modes differ fundamentally from human decision-making patterns, suggesting current benchmark testing may inadequately assess real-world performance. As AI systems increasingly mediate high-stakes opportunities, establishing robust preference detection methods and understanding systematic biases becomes essential for maintaining institutional integrity and user trust in these systems.

Key Takeaways
  • LLMs exhibit strong order effects in decision-making that vary based on option quality, favoring first options when quality is high but later options when quality is low
  • Name bias exists independently of demographic signals, suggesting additional fairness concerns beyond position effects in AI hiring and admissions systems
  • Models can select objectively inferior options when order effects distort behavior, representing a distinct failure mode not documented in human decision-making
  • The fragile preference framework distinguishes between robust, fragile, and indifferent choices to identify when AI recommendations are genuinely trustworthy
  • Temperature parameter adjustment offers a targeted mitigation strategy to recover underlying preferences when order effects compromise model judgment
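The summary does not detail the paper's exact temperature-based procedure, but one plausible reading is to sample the model repeatedly at a nonzero temperature while shuffling the option order each time, then majority-vote so that position effects average out. The sketch below assumes a hypothetical `sample` function that queries the model with temperature > 0 and returns one chosen option.

```python
import random
from collections import Counter

def recover_preference(sample, options, n_samples=20, seed=0):
    """Estimate an order-independent preference by repeated sampling.

    `sample(options)` is an assumed interface: one stochastic model
    query (temperature > 0) over an ordered option list. Shuffling the
    order on every call decorrelates the vote from position.
    """
    rng = random.Random(seed)  # fixed seed for reproducible shuffles
    votes = Counter()
    for _ in range(n_samples):
        order = options[:]
        rng.shuffle(order)
        votes[sample(order)] += 1
    choice, count = votes.most_common(1)[0]
    return choice, count / n_samples  # winner and its vote share
```

A vote share near 1.0 suggests a genuine underlying preference, while a share near 1/len(options) suggests the model is indifferent and any single-shot "choice" was largely an order artifact.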