#memory-benchmarks News & Analysis

2 articles tagged with #memory-benchmarks. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.


AI · Neutral · arXiv – CS AI · Apr 14 · 6/10


ATANT v1.1: Positioning Continuity Evaluation Against Memory, Long-Context, and Agentic-Memory Benchmarks

ATANT v1.1 is a companion paper that clarifies why existing memory and context evaluation benchmarks (LOCOMO, LongMemEval, BEAM, MemoryBench, and others) do not measure 'continuity' as defined in the original v1.0 framework. According to the analysis, existing benchmarks cover a median of only 1 of the 7 required continuity properties. The authors illustrate the measurement gap with comparative scoring: their system achieves 96% on ATANT but only 8.8% on LOCOMO, which they present as evidence that the benchmarks evaluate different capabilities.

AI · Neutral · arXiv – CS AI · Apr 10 · 6/10


A-MBER: Affective Memory Benchmark for Emotion Recognition

Researchers introduce A-MBER, a benchmark dataset designed to evaluate AI assistants' ability to recognize emotions based on long-term interaction history rather than immediate context. The benchmark tests whether models can retrieve relevant past interactions, infer current emotional states, and provide grounded explanations. The findings indicate that memory's value lies in selective, context-aware interpretation rather than in the sheer volume of stored history.