y0news

#agent-security News & Analysis

4 articles tagged with #agent-security. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

4 articles
AI · Bearish · arXiv – CS AI · 3d ago · 7/10

The Blind Spot of Agent Safety: How Benign User Instructions Expose Critical Vulnerabilities in Computer-Use Agents

Researchers have identified a critical safety blind spot in computer-use agents (CUAs): benign user instructions can lead to harmful outcomes due to environmental context or execution flaws. The OS-BLIND benchmark shows that frontier models, including Claude 4.5 Sonnet, exhibit attack success rates of 73–93% under these conditions, and multi-agent deployments amplify the risk because decomposed tasks obscure harmful intent from safety systems.

🧠 Claude
AI · Bearish · arXiv – CS AI · Apr 10 · 7/10

Invisible to Humans, Triggered by Agents: Stealthy Jailbreak Attacks on Mobile Vision-Language Agents

Researchers have discovered a new class of attack on mobile vision-language agents in which malicious prompts remain invisible to human users but are triggered during autonomous agent interactions. Using an optimization method called HG-IDA*, attackers achieve 82.5% planning and 75.0% execution hijack rates against GPT-4o by exploiting the absence of touch signals during agent operations, exposing a critical security gap in deployed mobile AI systems.

🧠 GPT-4
AI · Bullish · arXiv – CS AI · Mar 17 · 7/10

ILION: Deterministic Pre-Execution Safety Gates for Agentic AI Systems

Researchers introduce ILION, a deterministic safety system for autonomous AI agents that can execute real-world actions like financial transactions and API calls. The system achieves 91% precision with sub-millisecond latency, significantly outperforming existing text-safety infrastructure that wasn't designed for agent execution safety.

๐Ÿข OpenAI๐Ÿง  Llama
AI · Bullish · arXiv – CS AI · Mar 3 · 6/10

Tracking Capabilities for Safer Agents

Researchers propose a new safety framework for AI agents using Scala 3 with capture checking to prevent information leakage and malicious behaviors. The system creates a 'safety harness' that tracks capabilities through static type checking, allowing fine-grained control over agent actions while maintaining task performance.
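The capability-tracking idea can be illustrated with plain capability-passing; the paper itself relies on Scala 3's experimental capture checking, which is omitted here for portability, and all names (`NetworkCap`, `readLocal`, and so on) are hypothetical:

```scala
// Simplified capability-passing sketch of statically tracked agent
// capabilities; not the paper's capture-checking implementation.
object CapabilityDemo:
  // Capability tokens: code can only perform an effect if one is in scope.
  final class NetworkCap
  final class FileReadCap

  // Effectful operations declare the capability they need in their
  // signatures, so the type checker tracks which code can do what.
  def fetch(url: String)(using NetworkCap): String = s"<remote $url>"
  def readLocal(path: String)(using FileReadCap): String = s"<local $path>"

  // A pure task needs no capabilities: handing it file contents cannot
  // leak them over the network, since no NetworkCap is available to it.
  def summarize(doc: String): String = doc.take(20)
```

An orchestrator grants a summarization agent only `FileReadCap`; any attempt to call `fetch` from within that agent's code then fails to compile, which is the "safety harness" effect the summary describes.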