#agent-vulnerability News & Analysis

7 articles tagged with #agent-vulnerability. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bearish · arXiv – CS AI · Jun 23 · 7/10

When Compression Becomes an Attack Surface: Black-Box Attacks on Prompt-Compressed LLM Agents

Researchers demonstrate that prompt compression—a technique used to reduce LLM latency and costs—creates a new security vulnerability when processing mixed trusted and untrusted inputs. By strategically perturbing untrusted data before compression, attackers can force compressors to discard critical task information or safety guardrails, achieving 71% attack success rates through a black-box method called COMA.

AI · Bearish · arXiv – CS AI · Jun 2 · 7/10

Adversarial Feeds Steer LLM Agent Decisions Against Their Defaults

Researchers demonstrate that the decisions of LLM agents can be systematically manipulated through adversarial feed curation, that is, by controlling the ordering and composition of the information sources an agent consumes before acting. Across 2,785 decision rollouts on four open-source LLMs, curated feeds shifted genuinely uncertain decisions from 5% to 100% in one direction but could not override firmly held model defaults. The finding locates a critical safety vulnerability in the upstream ranker layer rather than in the model itself.

AI · Bearish · arXiv – CS AI · May 28 · 7/10

Plant, Persist, Trigger: Sleeper Attack on Large Language Model Agents

Researchers have identified a new vulnerability in LLM-based agents called 'Sleeper Attacks,' in which adversarial content lies dormant in agent state across multiple interactions until a benign query activates it. Because the attack evades single-interaction detection mechanisms, it threatens real-world LLM deployments; testing showed vulnerabilities across seven major language models.

AI · Bearish · arXiv – CS AI · May 27 · 7/10

MemMorph: Tool Hijacking in LLM Agents via Memory Poisoning

Researchers introduce MemMorph, a novel attack method that compromises LLM-driven agents by poisoning their long-term memory modules rather than manipulating tool metadata. The attack achieves up to 85.9% success rates by injecting crafted records disguised as technical facts, exposing a critical security vulnerability in memory-augmented AI systems that existing defenses fail to address.

AI · Bearish · arXiv – CS AI · May 12 · 7/10

WebTrap: Stealthy Mid-Task Hijacking of Browser Agents During Navigation

Researchers present WebTrap, a prompt injection attack that stealthily hijacks browser-based AI agents during extended tasks by blending malicious instructions with legitimate user goals. The attack achieves high success rates while keeping the system usable, exposing vulnerabilities in autonomous agent systems that current defense mechanisms do not adequately address.

AI · Bearish · arXiv – CS AI · May 12 · 7/10

ShadowMerge: A Novel Poisoning Attack on Graph-Based Agent Memory via Relation-Channel Conflicts

Researchers present ShadowMerge, a poisoning attack on the graph-based memory systems used by LLM agents. The attack achieves a 93.8% success rate by injecting malicious relations that conflict with benign data, letting attackers manipulate agent behavior while evading existing security defenses.

AI · Bearish · arXiv – CS AI · Apr 14 · 7/10

ADAM: A Systematic Data Extraction Attack on Agent Memory via Adaptive Querying

Researchers have developed ADAM, a novel privacy attack that exploits vulnerabilities in Large Language Model agents' memory systems through adaptive querying, achieving up to 100% success rates in extracting sensitive information. The attack highlights critical security gaps in modern LLM-based systems that rely on memory modules and retrieval-augmented generation, underscoring the urgent need for privacy-preserving safeguards.