AI · Neutral · arXiv – CS AI · Apr 20 · 6/10
🧠Researchers introduced 'Mind's Eye,' a benchmark that tests multimodal large language models (MLLMs) on visual reasoning tasks inspired by human intelligence tests. The evaluation reveals a significant gap between human performance (80% accuracy) and leading MLLMs (below 50%), exposing limitations in visuospatial reasoning, visual attention, and conceptual abstraction.
AI · Neutral · arXiv – CS AI · Apr 20 · 6/10
🧠Researchers challenge the Uniform Information Density hypothesis in LLM reasoning, finding that high-quality reasoning exhibits locally smooth but globally non-uniform information flow. This counter-intuitive pattern suggests LLMs optimize differently than human communication, with entropy-based metrics effectively predicting reasoning quality across seven benchmarks.
AI · Neutral · arXiv – CS AI · Apr 14 · 6/10
🧠Researchers discovered that large language models exhibit working memory limitations similar to humans, encoding multiple memory items in entangled representations that require interference control rather than direct retrieval. This finding reveals a shared computational constraint between biological and artificial systems, suggesting that working memory capacity may be a fundamental bottleneck in intelligent systems rather than a limitation unique to biological brains.
AI · Neutral · arXiv – CS AI · Apr 14 · 6/10
🧠Researchers propose a human-centered framework for evaluating whether AI systems fail in ways similar to humans by measuring out-of-distribution performance across a spectrum of perceptual difficulty rather than arbitrary distortion levels. Testing this approach on vision models reveals that vision-language models show the most consistent human alignment, while the relative performance of CNNs and ViTs depends on the difficulty regime.
AI · Neutral · arXiv – CS AI · Apr 13 · 6/10
🧠Researchers formalize how agents can use environmental artifacts as external memory to reduce computational requirements in reinforcement learning tasks. The study demonstrates that spatial observations can implicitly serve as memory substitutes, allowing agents to learn effective policies with less internal memory capacity than previously thought necessary.
AI · Bearish · Fortune Crypto · Apr 11 · 6/10
🧠Psychologists warn that AI automation of routine tasks may harm cognitive health, as mundane work provides necessary mental recovery and default-mode processing. While AI promises productivity gains by eliminating boring work, research suggests these seemingly unproductive tasks are essential for brain function and psychological well-being.
AI · Neutral · arXiv – CS AI · Mar 17 · 6/10
🧠Research reveals that Large Language Models struggle with dynamic Theory of Mind tasks, particularly tracking how others' beliefs change over time. While LLMs can infer current beliefs effectively, they fail to maintain and retrieve prior belief states after updates occur, showing patterns consistent with human cognitive biases.
AI · Bullish · arXiv – CS AI · Mar 17 · 6/10
🧠Researchers propose a new AI learning architecture inspired by human and animal cognition that integrates observational learning and active behavior learning. The framework includes a meta-control system that switches between learning modes, addressing current limitations in autonomous AI learning.
AI · Bullish · arXiv – CS AI · Mar 16 · 6/10
🧠Researchers have developed PsyCogMetrics AI Lab, a cloud-based platform that applies psychometric and cognitive science methodologies to evaluate Large Language Models. The platform was created through a three-cycle Action Design Science study and aims to advance AI evaluation methods at the intersection of psychology, cognitive science, and artificial intelligence.
AI · Neutral · arXiv – CS AI · Mar 6 · 6/10
🧠Researchers replicated and extended AI introspection studies, finding that large language models detect injected thoughts through two distinct mechanisms: probability-matching based on prompt anomalies and direct access to internal states. The direct access mechanism is content-agnostic, meaning models can detect anomalies but struggle to identify their semantic content, often confabulating high-frequency concepts.
AI · Bullish · MIT News – AI · Jan 14 · 5/10 · 9
🧠MIT has renamed and expanded its intelligence research initiative to the MIT Siegel Family Quest for Intelligence with support from the Siegel Family Endowment. The program focuses on understanding how brains produce intelligence and developing methods to replicate this intelligence for practical problem-solving applications.
AI · Neutral · arXiv – CS AI · Mar 5 · 4/10
🧠Researchers propose a standardized framework for classifying and evaluating memory capabilities in reinforcement learning agents, drawing from cognitive science concepts. The paper addresses confusion around memory terminology in RL and provides practical definitions for different memory types along with robust experimental methodologies.
AI · Neutral · arXiv – CS AI · Mar 3 · 4/10 · 3
🧠Researchers propose that language models could help address longstanding challenges in cognitive science research, including integration, formalization, and conceptual clarity. The paper suggests AI tools should complement rather than replace human researchers to create more integrative and cumulative cognitive science.
AI · Neutral · arXiv – CS AI · Feb 27 · 4/10 · 5
🧠Researchers propose using category theory to formalize knowledge domains and construct analogies between different fields. The paper demonstrates this approach using the classic analogy between the solar system and hydrogen atom, showing how mathematical structures like functors and pullbacks can define analogical relationships.
AI · Neutral · arXiv – CS AI · Mar 3 · 4/10 · 5
🧠Researchers analyzed how Large Language Models access semantic memory using the Semantic Fluency Task, finding that LLMs exhibit similar memory foraging patterns to humans. The study reveals convergent and divergent search strategies in LLMs that mirror human cognitive behavior, potentially enabling better human-AI alignment or productive cognitive disalignment.