y0news

According to Me: Long-Term Personalized Referential Memory QA

arXiv – CS AI | Jingbiao Mei, Jinghong Chen, Guangyu Yang, Xinyu Hou, Margaret Li, Bill Byrne
🤖 AI Summary

Researchers introduce ATM-Bench, described as the first benchmark for evaluating AI assistants' ability to recall and reason over long-term personalized memory across multiple modalities. Current state-of-the-art memory systems score under 20% accuracy on the ATM-Bench-Hard set, which points to significant limitations in personalized AI capabilities.

Key Takeaways
  • ATM-Bench is the first multimodal benchmark for testing AI assistants' long-term personalized memory capabilities across images, videos, and emails.
  • Current state-of-the-art memory systems achieve less than 20% accuracy on the challenging ATM-Bench-Hard dataset.
  • The benchmark includes four years of privacy-preserving personal memory data with human-annotated question-answer pairs.
  • Schema-Guided Memory (SGM) outperforms traditional Descriptive Memory approaches used in previous research.
  • The research exposes critical gaps in AI systems' ability to handle personalized references and multi-source reasoning.