
#human-ai-comparison News & Analysis

7 articles tagged with #human-ai-comparison. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bearish · arXiv – CS AI · Mar 5 · 6/10

Language Model Goal Selection Differs from Humans' in an Open-Ended Task

Research comparing four state-of-the-art language models (GPT-5, Gemini 2.5 Pro, Claude Sonnet 4.5, and Centaur) to humans in goal selection tasks reveals substantial divergence in behavior. While humans explore diverse approaches and learn gradually, the AI models tend to exploit single solutions or show poor performance, raising concerns about using current LLMs as proxies for human decision-making in critical applications.

Claude · Gemini
AI · Bearish · arXiv – CS AI · Apr 6 · 6/10

High Volatility and Action Bias Distinguish LLMs from Humans in Group Coordination

Research comparing large language models (LLMs) to humans in group coordination tasks reveals that LLMs exhibit excessive volatility and switching behavior that impairs collective performance. Unlike humans who adapt and stabilize over time, LLMs fail to improve across repeated coordination games and don't benefit from richer feedback mechanisms.

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 4

When AI Gives Advice: Evaluating AI and Human Responses to Online Advice-Seeking for Well-Being

A research study comparing AI-generated advice to human Reddit responses found that large language models like GPT-4o significantly outperformed crowd-sourced advice on effectiveness, warmth, and user satisfaction metrics. The study suggests human advice can be enhanced through AI polishing, pointing toward hybrid systems combining AI, crowd input, and expert oversight.

AI · Bearish · arXiv – CS AI · Mar 2 · 6/10 · 13

Humans and LLMs Diverge on Probabilistic Inferences

Researchers created ProbCOPA, a dataset testing probabilistic reasoning in humans versus AI models, finding that state-of-the-art LLMs consistently fail to match human judgment patterns. The study reveals fundamental differences in how humans and AI systems process non-deterministic inferences, highlighting limitations in current AI reasoning capabilities.

AI · Neutral · IEEE Spectrum – AI · Feb 12 · 6/10 · 3

ChatGPT’s Translation Skills Parallel Most Human Translators

A new study published in IEEE Transactions on Big Data found that ChatGPT's GPT-4 model performs at the level of junior and medium-level human translators, marking potentially the first time an AI algorithm has reached human-level translation quality. Only senior translators with 10+ years of experience and professional certification clearly outperformed the AI models.

AI · Neutral · arXiv – CS AI · Feb 27 · 4/10 · 8

Exploring Human Behavior During Abstract Rule Inference and Problem Solving with the Cognitive Abstraction and Reasoning Corpus

Researchers introduced CogARC, a human-adapted subset of the Abstraction and Reasoning Corpus, to study how humans solve abstract visual reasoning problems. In experiments with 260 participants solving 75 problems, researchers found high success rates (~80-90%) but significant variation in problem difficulty and solution strategies.

AI · Neutral · Google DeepMind Blog · Nov 11 · 4/10 · 6

Teaching AI to see the world more like we do

A new research paper examines how AI systems perceive and organize visual information differently from humans, analyzing the fundamental differences in visual processing between artificial intelligence and human cognition.