Why Do LLMs Struggle in Strategic Play? Broken Links Between Observations, Beliefs, and Actions
Researchers have identified critical vulnerabilities in how large language models make strategic decisions under incomplete information, revealing gaps between their internal beliefs and their external reasoning. The study shows that LLMs internally encode beliefs that are more accurate than the ones they express verbally, but that these internal beliefs are brittle and degrade under multi-hop reasoning, raising serious concerns about deploying LLMs in high-stakes decision-making without safeguards.
The research addresses a fundamental blind spot in AI deployment: LLMs perform strategic reasoning tasks with apparent competence, yet fail in ways that remain opaque to users and developers. By examining the internal mechanisms of models like Llama 3.1 and Qwen3, researchers discovered that models maintain more sophisticated situational understanding than their verbal outputs suggest, exposing a disconnect between what models actually 'know' and what they communicate.
This finding builds on growing concerns about AI alignment and interpretability. As LLMs are increasingly integrated into high-stakes domains, from financial trading to policy advisory, understanding these failure modes becomes critical. The observation-belief gap is particularly troubling: models that appear confident in their reasoning may harbor brittle, incoherent internal states subject to cognitive biases resembling human reasoning errors. Belief accuracy deteriorates over multi-hop reasoning, stated beliefs drift away from Bayesian coherence, and primacy-recency biases over the observation history can push models toward suboptimal decisions.
The belief-action gap compounds these problems. Even when internal beliefs are accurate, converting them into actions proves unreliable: neither explicit belief conditioning nor the models' implicit internal beliefs consistently improve performance, suggesting fundamental friction in how LLMs translate understanding into decisions.
For enterprises and developers deploying LLMs in strategic contexts, this research underscores the need for robust guardrails and human oversight. The vulnerabilities identified warrant caution against fully autonomous LLM deployment in negotiation, trading, or policymaking. Future work should focus on techniques to stabilize internal belief representations and improve belief-to-action coherence before these systems operate independently in consequential domains.
- LLMs maintain hidden beliefs substantially more accurate than their verbal statements, yet these beliefs are brittle and susceptible to degradation over extended reasoning.
- Multi-hop reasoning, primacy-recency biases, and drift from Bayesian coherence undermine LLM decision-making reliability in incomplete-information scenarios.
- Internal beliefs fail to consistently improve game payoffs compared to externalized beliefs, indicating a fundamental belief-to-action conversion gap.
- Strategic deployment of LLMs in negotiation, trading, or policymaking requires robust guardrails and human oversight given these systematic vulnerabilities.
- Analyzing LLM internal processes reveals failure modes that remain invisible to external evaluation metrics and user-facing outputs.