Hijacking Agent Memory: Stealthy Trojan Attacks Through Conversational Interaction
Researchers present MemPoison, a novel attack that exploits vulnerabilities in large language model agents by injecting malicious information into their long-term memory through dialogue interactions. The attack achieves up to 95% success rates by using semantic bridges, entity masquerading, and embedding optimization to bypass modern selective memory mechanisms, revealing critical security gaps in autonomous AI systems.