y0news
#ai-assistants: 1 article
AI · Neutral · arXiv – CS AI · 6h ago

According to Me: Long-Term Personalized Referential Memory QA

Researchers introduce ATM-Bench, the first benchmark for evaluating AI assistants' ability to recall and reason over long-term personalized memory across multiple modalities. The benchmark reveals poor performance (under 20% accuracy) for current state-of-the-art memory systems, highlighting significant limitations in personalized AI capabilities.