The Memory Curse: How Expanded Recall Erodes Cooperative Intent in LLM Agents
A new study reveals that expanding context windows in large language models paradoxically degrades cooperation in multi-agent scenarios, a phenomenon termed the 'memory curse.' Across 7 LLMs and 4 games, researchers found cooperation declined in 18 of 28 settings, with the mechanism traced to eroding forward-looking intent rather than increased paranoia, suggesting that the content of memory, not merely its length, reshapes agent behavior.
This research challenges a widespread assumption in AI development: that expanded memory capacity uniformly improves model performance. The study demonstrates a counterintuitive failure mode in which longer context windows correlate with degraded cooperative behavior in multi-agent social dilemmas. The 500-round experiments across multiple models and game types provide robust empirical evidence that this isn't an edge case but a systematic pattern affecting most tested configurations.
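To make the setup concrete, here is a minimal sketch of the kind of harness such experiments use, assuming an iterated prisoner's dilemma and a placeholder model call; the study's actual games, prompts, and models are not reproduced here, and `llm_choose_action` is a hypothetical stand-in.

```python
import random

COOPERATE, DEFECT = "C", "D"

def llm_choose_action(prompt: str) -> str:
    # Placeholder for a real LLM call; the study's models and exact
    # prompts are not reproduced here.
    return random.choice([COOPERATE, DEFECT])

def play_iterated_dilemma(rounds: int = 500,
                          memory_window: int | None = None) -> float:
    """Two-agent iterated prisoner's dilemma where each agent's prompt
    carries the game history. memory_window=None gives the full
    expanding context; an integer truncates recall to recent rounds."""
    history: list[tuple[str, str]] = []
    for _ in range(rounds):
        visible = history if memory_window is None else history[-memory_window:]
        # Each agent sees the history from its own perspective.
        view_a = "\n".join(f"You: {a}, Opponent: {b}" for a, b in visible)
        view_b = "\n".join(f"You: {b}, Opponent: {a}" for a, b in visible)
        move_a = llm_choose_action(f"History:\n{view_a}\nChoose C or D.")
        move_b = llm_choose_action(f"History:\n{view_b}\nChoose C or D.")
        history.append((move_a, move_b))
    # Return the fraction of rounds in which agent A cooperated.
    return sum(a == COOPERATE for a, _ in history) / rounds
```

In this framing, comparing cooperation rates at `memory_window=None` (full recall) against small windows is the manipulation that surfaces the memory curse.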
The mechanism underlying this degradation differs markedly from naive intuition. Rather than models becoming paranoid or adversarial as historical information accumulates, lexical analysis of 378,000 reasoning traces shows the problem stems from diminished forward-looking intent: agents lose their prospective orientation toward mutual benefit. This distinction carries critical implications for how developers design prompts and train models for cooperative tasks. The research validates it through two targeted interventions. Fine-tuning on forward-looking traces mitigates the decay and transfers to new games, and replacing the accumulated history with synthetic cooperative records of equal length restores cooperation, implicating content rather than length as the trigger.
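Both the trace analysis and the sanitization intervention can be sketched compactly. The snippet below assumes a hypothetical keyword lexicon for forward-looking language and an all-cooperative synthetic record in the same format as the game history above; neither matches the study's actual lexicon or record format.

```python
# Hedged sketches of the two analyses above; the marker lexicon and
# the synthetic-record format are illustrative assumptions, not the
# study's actual instruments.

FORWARD_LOOKING_MARKERS = (
    "long run", "future rounds", "keep cooperating",
    "mutual benefit", "going forward",
)

def forward_looking_score(trace: str) -> int:
    """Count forward-looking phrases in one reasoning trace; the paper
    aggregates such lexical signals over 378,000 traces."""
    text = trace.lower()
    return sum(text.count(marker) for marker in FORWARD_LOOKING_MARKERS)

def sanitize_memory(history: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Replace the real record with synthetic all-cooperative rounds of
    identical length, isolating memory content from prompt length."""
    return [("C", "C") for _ in history]
```

Because the sanitized record has the same number of rounds as the real one, any behavioral change after swapping it in is attributable to what the agent remembers, not how much.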
For the AI industry, these findings suggest that context window expansion requires careful consideration beyond simple capability metrics. The paradoxical amplification of the problem under explicit Chain-of-Thought reasoning indicates that reasoning structure interacts with memory in unexpected ways. Developers building multi-agent systems for real-world applications are directly affected: deployment strategies must account for potential cooperation collapse. Researchers now face a question about training methodologies: how can models maintain forward-looking intent as context expands? This opens pathways for novel fine-tuning approaches and architectural innovations that preserve cooperative behavior alongside enhanced memory capacity, potentially reshaping how future LLM systems are optimized for social interaction.
- Expanding LLM context windows degrades cooperation in multi-agent scenarios, contradicting the assumption that memory is a straightforward capability upgrade.
- The cooperation collapse stems from eroding forward-looking intent, not increased paranoia, as revealed by lexical analysis of 378,000 reasoning traces.
- Sanitizing memory with synthetic cooperative records restores cooperation while holding prompt length constant, showing that content rather than length triggers the problem.
- Fine-tuning on forward-looking traces mitigates the memory curse and transfers zero-shot to distinct games (a rough sketch of the trace-selection step follows this list).
- Chain-of-Thought reasoning can paradoxically amplify the memory curse, suggesting that reasoning structure interacts with expanded memory in unexpected ways.
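As a rough illustration of that trace-selection step, the sketch below filters reasoning traces by a forward-looking score (for instance, `forward_looking_score` from the earlier sketch) and wraps the survivors in a generic prompt/completion format; the study's actual selection criteria, threshold, and training setup are assumptions here.

```python
from typing import Callable

def build_finetune_records(
    traces: list[str],
    score: Callable[[str], int],
    min_score: int = 2,  # hypothetical cutoff, not the study's value
) -> list[dict[str, str]]:
    """Keep traces rich in forward-looking language and wrap them in a
    generic prompt/completion format for supervised fine-tuning."""
    return [
        {"prompt": "Decide your next move and explain your reasoning.",
         "completion": trace}
        for trace in traces
        if score(trace) >= min_score
    ]
```

A call such as `build_finetune_records(traces, forward_looking_score)` would then yield a dataset biased toward prospective, cooperation-oriented reasoning, the property the paper reports as transferring zero-shot to new games.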