AI · Bearish · arXiv – CS AI · 9h ago · 7/10
PersistBench: When Should Long-Term Memories Be Forgotten by LLMs?
Researchers introduce PersistBench, a benchmark that measures safety risks in large language models equipped with long-term memory. Across 18 evaluated LLMs, the study reports median failure rates of 53% for cross-domain information leakage and 97% for memory-induced bias reinforcement, pointing to critical vulnerabilities in conversational AI systems.