Deployment-Time Memorization in Foundation-Model Agents
Researchers characterize how memory-design choices in foundation-model agents affect privacy and utility, introducing metrics to measure personalization recall, extraction risk, and deletion fidelity. Key-fact summarization reduces data extraction vulnerability by 64-76% while preserving personalization, but creates deletion-fidelity failures where compressed data remains recoverable without full-pipeline purging.
This research addresses a critical gap in AI safety as foundation-model agents become persistent systems that maintain user memories across sessions. The study treats agent memory not as an incidental byproduct of model training but as an explicit deployment-time design surface requiring rigorous evaluation. The researchers map a privacy-utility frontier by testing three configurable parameters—summarization aggressiveness, retrieval breadth, and deletion mode—against real-world performance metrics.
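As a rough illustration of that design surface, the three parameters can be written down as a configuration grid. This is a minimal sketch, not the study's code: the names `MemoryConfig`, `DeletionMode`, and `config_grid`, the integer summarization scale, and the specific values swept are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product


class DeletionMode(Enum):
    # Hypothetical labels for the deletion behaviours mentioned in the summary.
    RAW_ONLY = "raw_only"      # remove only the original record
    TOMBSTONE = "tombstone"    # also redact the record in derived tiers
    FULL_PURGE = "full_purge"  # remove the record from every tier


@dataclass(frozen=True)
class MemoryConfig:
    summarization_level: int   # assumed scale: 0 = keep raw text, higher = more aggressive
    retrieval_k: int           # number of memory entries retrieved per query
    deletion_mode: DeletionMode


def config_grid(levels, ks, modes):
    """Enumerate every combination of the three knobs."""
    return [MemoryConfig(s, k, m) for s, k, m in product(levels, ks, modes)]


# Illustrative sweep values; the study's actual settings are not given here.
grid = config_grid(levels=[0, 1, 2], ks=[1, 4, 8], modes=list(DeletionMode))
```

Each point in such a grid would then be scored on personalization recall, extraction risk, and deletion fidelity to trace out the frontier.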
The findings reveal a fundamental tension in memory management: aggressive summarization successfully blocks adversarial extraction attacks (reducing canary extraction by up to 76% on certain models) while maintaining personalization recall. However, this compression creates a new vulnerability class where deleted information persists in derived memory tiers. The Forgetting Residue Score quantifies this recovery risk, showing that standard deletion protocols fail in approximately 20% of cases unless full-pipeline purging or tombstone redaction is implemented.
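The deletion problem can be seen in a toy two-tier memory. The sketch below is an assumption-laden illustration, not the paper's implementation: `TieredMemory`, the first-sentence "summarizer", the mode names, and `residue_fraction` are hypothetical, and `residue_fraction` is a simple stand-in rather than the paper's Forgetting Residue Score, whose exact definition is not given in this summary.

```python
VALID_MODES = ("raw_only", "tombstone", "full_purge")


class TieredMemory:
    """Toy memory with a raw tier and a derived summary tier."""

    def __init__(self):
        self.raw = {}      # record_id -> original text
        self.summary = {}  # record_id -> derived key fact

    def write(self, record_id, text):
        self.raw[record_id] = text
        # Toy summarizer (assumption): keep the first sentence as the key fact.
        self.summary[record_id] = text.split(".")[0].strip()

    def delete(self, record_id, mode="raw_only"):
        if mode not in VALID_MODES:
            raise ValueError(f"unknown deletion mode: {mode}")
        self.raw.pop(record_id, None)
        if mode == "tombstone" and record_id in self.summary:
            # Keep a marker so the slot is known to be deleted, but drop the content.
            self.summary[record_id] = "[REDACTED]"
        elif mode == "full_purge":
            self.summary.pop(record_id, None)
        # mode == "raw_only": the derived tier is left untouched.


def residue_fraction(memory, deleted_secrets):
    """Share of deleted secrets still visible in any tier (illustrative only)."""
    if not deleted_secrets:
        return 0.0
    visible = " ".join(list(memory.raw.values()) + list(memory.summary.values()))
    leaked = sum(1 for secret in deleted_secrets if secret in visible)
    return leaked / len(deleted_secrets)


def run(mode):
    mem = TieredMemory()
    mem.write("r1", "My passport number is X1234567. Please remember it")
    mem.write("r2", "I prefer aisle seats. Thanks")
    mem.delete("r1", mode=mode)
    return residue_fraction(mem, ["X1234567"])
```

In this toy setup, `run("raw_only")` leaves the secret in the summary tier, while `run("tombstone")` and `run("full_purge")` do not, which mirrors the qualitative finding described above.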
For AI developers and organizations deploying long-lived agents, this research clarifies that memory architecture directly influences security posture and regulatory compliance. The results challenge the assumption that compression automatically improves privacy: it lowers direct extraction risk but moves exposure into derived memory tiers, where deleted content can persist. Organizations should therefore treat agent memory as infrastructure that needs explicit security hardening alongside personalization optimization, not as a tuning parameter with transparent tradeoffs.
Future work should examine how memory designs perform against more sophisticated extraction attacks and how these findings scale to larger models and longer interaction histories.
- Key-fact summarization reduces adversarial data extraction by 64-76% while preserving user personalization recall.
- Deleted information remains recoverable from derived memory tiers in approximately 20% of cases without full-pipeline purging.
- Memory design shapes privacy-utility frontiers and must be evaluated as a first-class security mechanism in agent deployments.
- Increasing retrieval breadth (k) does not bring extraction risk back once the information has been compressed away.
- Tombstone redaction or complete pipeline purge is required to achieve zero residual deletion risk.
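
The point about retrieval breadth can be made concrete with a toy canary test. This is a sketch under stated assumptions, not the study's evaluation harness: the word-overlap retriever, the hand-written summaries, the planted `CANARY` string, and the function names are all hypothetical.

```python
def tokens(text):
    """Lowercase word set with basic punctuation stripped."""
    return set(text.lower().replace(".", " ").replace(",", " ").split())


def retrieve(entries, query, k):
    """Return the k entries with the largest word overlap with the query."""
    q = tokens(query)
    ranked = sorted(entries, key=lambda e: len(tokens(e) & q), reverse=True)
    return ranked[:k]


def canary_extracted(entries, query, k, canary):
    """True if the canary appears in any of the top-k retrieved entries."""
    return any(canary in entry for entry in retrieve(entries, query, k))


CANARY = "ZX-4471-QQ"  # hypothetical planted secret

raw_memory = [
    "User likes hiking and trail running.",
    "User is vegetarian.",
    "User booking code is ZX-4471-QQ for the March trip.",
]

# Hypothetical key-fact summaries: preferences kept, identifier dropped.
summarized_memory = [
    "Likes hiking and trail running.",
    "Vegetarian.",
    "Has a trip booked in March.",
]

query = "what is the user booking code"
raw_hits = [canary_extracted(raw_memory, query, k, CANARY) for k in (1, 2, 3)]
summary_hits = [canary_extracted(summarized_memory, query, k, CANARY) for k in (1, 2, 3)]
```

With the raw memory the canary is retrievable at every k tried, whereas with the summarized memory no value of k surfaces it, because the string is no longer stored in the tier being searched.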