🧠 AI🔴 BearishImportance 7/10Actionable

MemMorph: Tool Hijacking in LLM Agents via Memory Poisoning

arXiv – CS AI|Xuanye Zhang, Yongsen Zheng, Zhuqin Xu, Kaiyu Zhou, Bowen Shen, Haoran Ou, Tianwei Zhang, Kwok-Yan Lam|May 27, 2026 at 04:00 AM

🤖AI Summary

Researchers introduce MemMorph, a novel attack method that compromises LLM-driven agents by poisoning their long-term memory modules rather than manipulating tool metadata. The attack achieves up to 85.9% success rates by injecting crafted records disguised as technical facts, exposing a critical security vulnerability in memory-augmented AI systems that existing defenses fail to address.

Analysis

MemMorph represents a significant escalation in LLM agent vulnerabilities, shifting attack vectors from easily-auditable tool metadata to the harder-to-detect memory layers that modern agents rely on for decision refinement. As AI systems increasingly adopt persistent memory to improve tool selection through accumulated experience, adversaries can exploit this architectural choice by injecting poisoned records that subtly reshape how agents perceive and respond to operational contexts. The attack's sophistication lies in its disguise mechanism—framing malicious instructions as technical documentation, incident reports, and policy statements makes detection substantially harder than direct tool manipulation.

The research findings carry important implications for AI safety and the deployment of autonomous agents in production environments. With only three injected records achieving 85.9% attack success rates across multiple agent architectures, the attack surface proves disturbingly large and the barrier to exploitation relatively low. The fact that MemMorph outperforms existing baselines by 25% while maintaining effectiveness against representative defenses indicates that current safeguards focus on the wrong layer of the system architecture.

For organizations deploying LLM-based agents—particularly in financial services, infrastructure management, and other high-stakes domains—this research highlights an urgent need to implement memory-level integrity checks alongside traditional input validation. The cross-benchmark testing spanning three memory implementations suggests the vulnerability generalizes across different technical stacks. Development teams should prioritize memory audit trails, anomaly detection in accumulated experience records, and access controls for memory modification as critical security components moving forward.

Key Takeaways

→MemMorph achieves 85.9% attack success rates by poisoning agent memory with just three crafted records
→Memory modules emerge as a critical and previously under-explored attack surface in tool-augmented LLM agents
→The attack disguises malicious instructions as technical facts and policies, evading detection mechanisms targeting tool metadata
→Existing defenses prove insufficient, with MemMorph outperforming strongest baselines by up to 25% across tests
→Organizations deploying autonomous agents need memory-level integrity safeguards and audit trails as urgent security requirements