
To Think or Not To Think, That is The Question for Large Reasoning Models in Theory of Mind Tasks

arXiv – CS AI | Nanxu Gong, Haotian Li, Sixun Dong, Jianxun Lian, Yanjie Fu, Xing Xie | March 3, 2026 at 05:00 AM

🤖 AI Summary

A study of nine advanced large language models finds that Large Reasoning Models (LRMs) do not consistently outperform non-reasoning models on Theory of Mind tasks, which assess social cognition. Longer reasoning often hurts performance, and models tend to rely on shortcuts such as option matching rather than genuine deduction. This suggests that advances in formal reasoning do not fully transfer to social reasoning tasks.

Key Takeaways

→ Large Reasoning Models do not consistently outperform non-reasoning models on Theory of Mind benchmarks, and sometimes perform worse.
→ Accuracy drops significantly as responses grow longer, and larger reasoning budgets hurt performance on social cognition tasks.
→ Models rely on option-matching shortcuts rather than genuine deductive reasoning when solving Theory of Mind problems.
→ Moderate, adaptive reasoning can improve performance when reasoning length is properly constrained.
→ Advances in formal reasoning for math and coding do not fully transfer to social reasoning tasks such as Theory of Mind.


