🧠 AI🔴 BearishImportance 7/10Actionable

Depth-Dependent Indirect Prompt Injection in Tool-Calling ReAct Agents: Injection Depth, Payload Framing, and Turn-Budget Sensitivity

arXiv – CS AI|Mohammadreza Rashidi|June 1, 2026 at 04:00 AM

🤖AI Summary

Researchers identified that indirect prompt injection attacks against ReAct AI agents succeed at dramatically different rates depending on where malicious payloads appear in tool sequences, with success rates dropping from 60% at the first tool observation to 0% at deeper positions. The study reveals that payload framing and conversation turn limits have minimal impact on attack success, making injection depth the critical vulnerability factor for AI agent systems handling real-world tasks.

Analysis

This research exposes a significant asymmetry in how modern AI agents defend against adversarial inputs, specifically in ReAct systems that combine reasoning with tool-calling. The team systematically tested indirect prompt injection—where attackers embed malicious instructions in tool responses—across multiple dimensions that prior benchmarks ignored. Their findings demonstrate that GPT-4o-mini becomes progressively resistant to injection attacks as payloads appear later in tool sequences, achieving complete failure by depth 4-5, while Claude Haiku exhibits consistent resistance regardless of injection position. This depth-dependency pattern stems from two distinct mechanisms: early resistance where models actively reject malicious instructions, and late-stage completion where agents finish their primary task before encountering the payload. The framing study, testing how rhetorical style affects injection success, showed tantalizing variation between 25% and 75% success rates but lacked statistical power to confirm significance. Turn budgets—the number of interactions allowed—showed no meaningful impact on vulnerability, suggesting agents don't accumulate susceptibility over conversation length. The practical insight that sanitizing only the first tool observation would capture two-thirds of attacks offers developers an immediate hardening strategy, though this may represent a false economy if sophisticated attackers learn to exploit deeper positions. For AI safety practitioners and companies deploying autonomous agents in production environments handling scheduling, file systems, or data access, these findings suggest injection depth is the dominant risk lever requiring careful attention during threat modeling.

Key Takeaways

→Indirect prompt injection success against GPT-4o-mini drops from 60% at depth 1 to 0% at depths 4-5, establishing injection depth as the primary vulnerability factor.
→Claude Haiku demonstrated consistent resistance across all injection depths, suggesting different models have fundamentally different defense mechanisms.
→Payload framing effects varied widely (25-75% success range) but lacked statistical significance at current sample size, requiring larger studies to confirm rhetorical manipulation risk.
→Sanitizing only the first tool observation would mitigate approximately 67% of measured injection attacks, offering developers a straightforward initial hardening approach.
→Turn budget and conversation length showed no correlation with injection success, contradicting assumptions that extended interactions increase AI agent vulnerability.

Mentioned in AI

Models

GPT-4OpenAI

ClaudeAnthropic

#prompt-injection #react-agents #ai-security #adversarial-attacks #tool-calling #gpt-4 #claude #ai-safety #vulnerability #autonomous-agents

Read Original →via arXiv – CS AI

Act on this with AI

Stay ahead of the market.

Connect your wallet to an AI agent. It reads balances, proposes swaps and bridges across 15 chains — you keep full control of your keys.

Connect Wallet to AI →How it works

AIMay 6

Your company’s AI could delete everything in 9 seconds. ServiceNow wants to be the kill switch

AIMay 6

Hut 8 (HUT) Stock Soars 37% on Massive $9.8 Billion AI Data Center Agreement

AIMay 6

Depth-Dependent Indirect Prompt Injection in Tool-Calling ReAct Agents: Injection Depth, Payload Framing, and Turn-Budget Sensitivity

Your company’s AI could delete everything in 9 seconds. ServiceNow wants to be the kill switch

Hut 8 (HUT) Stock Soars 37% on Massive $9.8 Billion AI Data Center Agreement

S&P 500 and NASDAQ hit record highs as AI chip stocks surge