y0news

#memory News & Analysis

11 articles tagged with #memory. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Apr 7 · 7/10

Readable Minds: Emergent Theory-of-Mind-Like Behavior in LLM Poker Agents

Research published on arXiv demonstrates that large language models playing poker can develop sophisticated Theory of Mind capabilities when equipped with persistent memory, progressing to advanced levels of opponent modeling and strategic deception. The study found that memory is necessary and sufficient for this emergent behavior, while domain expertise enhances but does not gate ToM development.

🧠 GPT-4
AI · Bullish · arXiv – CS AI · Mar 5 · 6/10

Memory, Benchmark & Robots: A Benchmark for Solving Complex Tasks with Reinforcement Learning

Researchers introduce MIKASA, a comprehensive benchmark suite designed to evaluate memory capabilities in reinforcement learning agents, particularly for robotic manipulation tasks. The framework includes MIKASA-Base for general memory RL evaluation and MIKASA-Robo with 32 specialized tasks for tabletop robotic manipulation scenarios.

AI · Neutral · arXiv – CS AI · Mar 4 · 6/10 · 4

Diagnosing Retrieval vs. Utilization Bottlenecks in LLM Agent Memory

Researchers analyzed memory systems in LLM agents and found that retrieval methods matter more for performance than write strategies. Simple raw-chunk storage matched expensive alternatives, suggesting that elaborate write-time processing can discard useful context that no retrieval method can later recover.
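The finding that raw chunks can match elaborate write-time processing suggests the gains live on the read side. A minimal sketch of that baseline (a hypothetical illustration with naive keyword-overlap retrieval, not the paper's code):

```python
# Hypothetical sketch: store interaction chunks verbatim at write time,
# then rank them at read time by token overlap with the query.
# Illustrates "raw chunk storage + retrieval", not the paper's pipeline.

class ChunkMemory:
    def __init__(self):
        self.chunks: list[str] = []  # raw text, no summarization on write

    def write(self, text: str) -> None:
        self.chunks.append(text)     # write strategy: store as-is

    def retrieve(self, query: str, k: int = 2) -> list[str]:
        q = set(query.lower().split())
        # score each chunk by shared tokens with the query
        scored = sorted(
            self.chunks,
            key=lambda c: len(q & set(c.lower().split())),
            reverse=True,
        )
        return scored[:k]

mem = ChunkMemory()
mem.write("user prefers dark mode in the editor")
mem.write("deployment uses docker compose on port 8080")
mem.write("user timezone is UTC+2")
top = mem.retrieve("which port does the deployment use", k=1)
# → ["deployment uses docker compose on port 8080"]
```

A real system would swap the overlap score for dense embeddings or BM25, but the write path would stay just as simple.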

AI · Neutral · arXiv – CS AI · Mar 17 · 6/10

Dynamic Theory of Mind as a Temporal Memory Problem: Evidence from Large Language Models

Research reveals that Large Language Models struggle with dynamic Theory of Mind tasks, particularly tracking how others' beliefs change over time. While LLMs can infer current beliefs effectively, they fail to maintain and retrieve prior belief states after updates occur, showing patterns consistent with human cognitive biases.
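The failure mode described, inferring the current belief while losing prior ones, is essentially a versioned-memory problem. A toy sketch of the bookkeeping such a task demands (a hypothetical Sally-Anne-style illustration, not the paper's setup):

```python
from collections import defaultdict

# Toy belief ledger: track how an agent's belief changes over time so
# both "what do they believe now?" and "what did they believe at step t?"
# are answerable. Hypothetical illustration, not the paper's benchmark.

class BeliefLedger:
    def __init__(self):
        # agent -> list of (timestep, belief) in chronological order
        self.history = defaultdict(list)

    def update(self, agent: str, t: int, belief: str) -> None:
        self.history[agent].append((t, belief))

    def current(self, agent: str) -> str:
        return self.history[agent][-1][1]

    def at(self, agent: str, t: int) -> str:
        """Belief held at time t: the last update with timestep <= t."""
        belief = None
        for ts, b in self.history[agent]:
            if ts <= t:
                belief = b
        return belief

ledger = BeliefLedger()
ledger.update("Sally", t=0, belief="marble in basket")
ledger.update("Sally", t=5, belief="marble in box")  # Sally sees it moved
now = ledger.current("Sally")       # the updated belief
earlier = ledger.at("Sally", t=3)   # the superseded belief
```

The paper's claim, in these terms, is that LLMs answer `current` well but effectively overwrite the history that `at` needs.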

AI · Neutral · arXiv – CS AI · Mar 3 · 6/10 · 8

Transformers Remember First, Forget Last: Dual-Process Interference in LLMs

Research analyzing 39 large language models reveals they exhibit proactive interference (remembering early information over recent) unlike humans who typically show retroactive interference. The study found this pattern is universal across all tested LLMs, with larger models showing better resistance to retroactive interference but unchanged proactive interference patterns.
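A toy version of such an interference probe (an assumed setup, not the study's harness): feed the model a sequence of reassignments to one key and check which value it reports. Proactive interference means reporting an early binding instead of the latest one.

```python
# Toy proactive-interference probe (assumed setup, not the study's code).
# A prompt reassigns the same key several times; a correct reader reports
# the *last* value, while proactive interference yields an *earlier* one.

def build_probe(key: str, values: list[str]) -> tuple[str, str]:
    """Return (prompt, correct_answer) for a key reassigned several times."""
    lines = [f"{key} = {v}" for v in values]
    prompt = "\n".join(lines) + f"\nWhat is the current value of {key}?"
    return prompt, values[-1]

def classify(answer: str, values: list[str]) -> str:
    """Label an answer: 'correct', 'proactive' (stuck on an old value), or 'other'."""
    if answer == values[-1]:
        return "correct"
    if answer in values[:-1]:
        return "proactive"
    return "other"

prompt, gold = build_probe("color", ["red", "green", "blue"])
# A model stuck on the first binding would answer "red":
label = classify("red", ["red", "green", "blue"])  # → "proactive"
```

Humans on analogous tasks tend toward the opposite error, with recent items overwriting earlier ones (retroactive interference).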

AI · Bullish · Google Research Blog · Dec 4 · 6/10 · 7

Titans + MIRAS: Helping AI have long-term memory

The post covers Titans and MIRAS, Google Research work on giving AI models long-term memory by letting them store and update information at test time rather than relying solely on a fixed context window. The goal is to ease current limits on memory retention and improve performance on long-horizon tasks.

AI · Bullish · Lil'Log (Lilian Weng) · Jun 23 · 6/10

LLM Powered Autonomous Agents

The article explores LLM-powered autonomous agents that use large language models as core controllers, going beyond text generation to serve as general problem solvers. Key systems like AutoGPT, GPT-Engineer, and BabyAGI demonstrate the potential of agents with planning, memory, and tool-use capabilities.

AI · Neutral · arXiv – CS AI · Mar 5 · 4/10

Unraveling the Complexity of Memory in RL Agents: an Approach for Classification and Evaluation

Researchers propose a standardized framework for classifying and evaluating memory capabilities in reinforcement learning agents, drawing from cognitive science concepts. The paper addresses confusion around memory terminology in RL and provides practical definitions for different memory types along with robust experimental methodologies.

AI · Neutral · arXiv – CS AI · Mar 3 · 4/10 · 7

RMBench: Memory-Dependent Robotic Manipulation Benchmark with Insights into Policy Design

Researchers introduced RMBench, a simulation benchmark for evaluating memory-dependent robotic manipulation tasks, addressing gaps in existing policies that struggle with historical reasoning. The study includes 9 manipulation tasks and proposes Mem-0, a modular policy designed to provide insights into how architectural choices affect memory performance in robotic systems.