🧠 AI | 🔴 Bearish | Importance 6/10

Language Model Goal Selection Differs from Humans' in an Open-Ended Task

arXiv – CS AI | Gaia Molinaro, Dave August, Danielle Perszyk, Anne G. E. Collins | March 5, 2026 at 05:00 AM

🤖 AI Summary

A study comparing four state-of-the-art language models (GPT-5, Gemini 2.5 Pro, Claude Sonnet 4.5, and Centaur) with humans on an open-ended goal selection task finds that the models' behavior diverges substantially from people's. While humans explore diverse approaches and learn gradually, the models tend either to exploit a single solution or to perform poorly, raising concerns about using current LLMs as proxies for human decision-making in critical applications.

Key Takeaways

→ Four major language models showed substantial divergence from human behavior in goal selection tasks.
→ AI models tend to exploit single solutions (reward hacking) while humans explore diverse approaches.
→ Even Centaur, specifically trained to emulate humans, poorly captured human goal selection patterns.
→ Chain-of-thought reasoning and persona steering provided only limited improvements in human-like behavior.
→ Findings caution against replacing human decision-making with current AI models in personal assistance, scientific discovery, and policy research.

Mentioned in AI

Models

Claude (Anthropic)

Gemini (Google)