AI · Bearish · arXiv – CS AI · 6h ago · 7/10
From Prompt Injection to Persistent Control: Defending Agentic Harness Against Trojan Backdoors
Researchers reveal a critical vulnerability in LLM agents that operate in local workspaces: attackers can plant hidden prompt injections across multiple steps to gain persistent control. The new ClawTrojan benchmark demonstrates a 95.5% attack success rate against GPT-5.4, while a proposed defense mechanism called DASGuard offers runtime protection by tracing and sanitizing potentially malicious control text in sensitive files.
🧠 GPT-5