AI · Neutral · arXiv – CS AI · 9h ago · 6/10
SubtleMemory: A Benchmark for Fine-Grained Relational Memory Discrimination in Long-Horizon AI Agents
Researchers introduce SubtleMemory, a benchmark for evaluating how AI agents handle complex relational memory tasks over long-horizon interactions. Tests of six memory systems and multiple agent architectures show that current systems struggle with fine-grained memory discrimination, exposing weaknesses in how they preserve and retrieve nuanced relationships between stored pieces of information.