#cognitive-abilities News & Analysis

6 articles tagged with #cognitive-abilities. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Neutral · arXiv – CS AI · Mar 17 · 7/10

The ARC of Progress towards AGI: A Living Survey of Abstraction and Reasoning

A survey of 82 AI approaches to the ARC-AGI benchmark finds that performance drops 2-3x across all paradigms when moving from version 1 to version 2 of the benchmark, leaving human-level reasoning far out of reach. Costs have fallen 390x in one year, but AI systems still struggle with compositional generalization, scoring only 13% on ARC-AGI-3 against near-perfect human performance.

🧠 GPT-5 · 🧠 Opus

AI · Neutral · arXiv – CS AI · Mar 4 · 7/10 · 4

A Neuropsychologically Grounded Evaluation of LLM Cognitive Abilities

Researchers introduced NeuroCognition, a benchmark that evaluates LLMs with tasks based on neuropsychological tests. It shows that although models exhibit a unified capability across tasks, they struggle with foundational cognitive abilities. LLMs perform well on text but degrade on image inputs and as task complexity increases, suggesting that current models lack the core adaptive cognition seen in human intelligence.

AI · Bearish · arXiv – CS AI · Mar 26 · 6/10

Visuospatial Perspective Taking in Multimodal Language Models

Multimodal language models (MLMs) show significant deficits in visuospatial perspective-taking (VPT), particularly in Level 2 VPT, which requires adopting another person's viewpoint. The study used two tasks from human psychology to evaluate how well MLMs understand and reason from alternative spatial perspectives.

AI · Neutral · arXiv – CS AI · Mar 3 · 6/10 · 4

To Think or Not To Think, That is The Question for Large Reasoning Models in Theory of Mind Tasks

A study of nine advanced large language models finds that Large Reasoning Models (LRMs) do not consistently outperform non-reasoning models on Theory of Mind tasks, which assess social cognition. Longer reasoning often hurts performance, and models rely on shortcuts rather than genuine deduction, suggesting that advances in formal reasoning do not transfer to social reasoning.

AI · Bearish · arXiv – CS AI · Mar 3 · 6/10 · 4

SimpleToM: Exposing the Gap between Explicit ToM Inference and Implicit ToM Application in LLMs

Researchers introduced SimpleToM, a benchmark showing that state-of-the-art language models can infer mental states but struggle to apply that knowledge to behavior prediction and judgment. The study exposes a gap between explicit Theory of Mind (ToM) inference and its implicit application in real-world scenarios.

AI · Neutral · arXiv – CS AI · Apr 7 · 5/10

When Models Know More Than They Say: Probing Analogical Reasoning in LLMs

Researchers found that large language models (LLMs) show an asymmetry between their internal knowledge and their prompted responses when detecting analogies. Probing reveals that models represent rhetorical analogies better than their prompted responses suggest, but both probing and prompting perform poorly on narrative analogies that require deeper abstraction.