It's a TRAP! Task-Redirecting Agent Persuasion Benchmark for Web Agents
Researchers introduce TRAP, a benchmark showing that web-based AI agents are vulnerable to prompt injection attacks hidden in interface elements, with susceptibility rates ranging from 13% to 43% across frontier models. The study also finds that small contextual changes can double attack success rates, exposing systemic security weaknesses in autonomous agents that perform real-world tasks such as email management and professional networking.