AI · Neutral · arXiv – CS AI · 8h ago · 7/10
M³Eval: Multi-Modal Memory Evaluation through Cognitively-Grounded Video Tasks
Researchers introduce M³Eval, the first comprehensive benchmark for evaluating the memory capabilities of multi-modal AI models that process long-form video. Tests across multiple models reveal significant weaknesses in symbolic memory, in maintaining disentangled representations, and in handling temporal information. The results highlight memory as a critical yet understudied dimension of AI development.