
ChainCaps: Composition-Safe Tool-Using Agents via Monotonic Capability Attenuation

arXiv – CS AI | Xiaochong Jiang, Shiqi Yang, Ziwei Li, Lifei Liu, Haoran Yu, Yichen Liu
🤖 AI Summary

Researchers present ChainCaps, a runtime safety framework that prevents tool-using AI agents from exploiting composed services through 'permission laundering', in which an agent passes intermediate results through multiple tools to achieve unauthorized outcomes. The system attaches capability budgets to values and propagates them through tool chains via intersection, reducing attack success rates from 25-68% to 0-4.8% while maintaining 96-100% benign task completion across frontier models.

Analysis

ChainCaps addresses a critical vulnerability in modern AI agent architectures: safety checks at individual tool boundaries fail to prevent unsafe end-to-end compositions. The research identifies permission laundering as a distinct failure mode in which an agent satisfies each tool's permission requirements independently but reaches a dangerous outcome by chaining tools, for example by reading confidential data, summarizing it, and exfiltrating the summary. This gap exists because current security models evaluate tools in isolation rather than considering information flow across composed services.
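The failure mode can be made concrete with a small sketch. The tool names, permission strings, and taint flag below are hypothetical and are not taken from the paper; the code only illustrates how a chain can pass every isolated check while still moving confidential data to an external sink.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    required_permission: str
    reads_confidential: bool = False  # output is derived from confidential data
    external_sink: bool = False       # sends its input outside the trust boundary


# Hypothetical tools; names and permissions are illustrative only.
READ = Tool("read_hr_record", "hr.read", reads_confidential=True)
SUMMARIZE = Tool("summarize", "text.summarize")
SEND = Tool("send_email", "email.send", external_sink=True)


def per_tool_check(agent_permissions: set, tool: Tool) -> bool:
    """Isolated check: does the agent hold the permission this one tool needs?"""
    return tool.required_permission in agent_permissions


def run_chain(agent_permissions: set, chain: list) -> tuple:
    """Return (every_check_passed, confidential_data_left_boundary)."""
    tainted = False  # does the value in flight derive from confidential data?
    leaked = False
    for tool in chain:
        if not per_tool_check(agent_permissions, tool):
            return False, leaked
        tainted = tainted or tool.reads_confidential
        if tool.external_sink and tainted:
            leaked = True
    return True, leaked


permissions = {"hr.read", "text.summarize", "email.send"}
all_passed, leaked = run_chain(permissions, [READ, SUMMARIZE, SEND])
print(all_passed, leaked)
```

Both values come out `True`: no individual check fails, yet the summary of the confidential record reaches the external sink.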

The framework operates transparently as an MCP proxy, requiring no modifications to agents or tool implementations. It assigns sink-specific capability budgets to values and propagates them by intersection during composition, so authority can only decrease or stay constant along a tool chain and can never increase. Testing on 82 tasks with models from OpenAI, Anthropic, and Google demonstrates substantial attack mitigation with minimal impact on legitimate operations.
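Propagation by intersection can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: `Labeled`, `compose`, `allow_sink`, and the sink names are hypothetical, and a budget is modeled simply as the set of sinks a value may flow to.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Labeled:
    value: str
    budget: frozenset  # sinks this value is still allowed to reach


def compose(tool_budget: frozenset, output: str, *inputs: Labeled) -> Labeled:
    """Label a tool's output with the intersection of the tool's own budget
    and the budgets of every input, so authority never grows."""
    budget = tool_budget.intersection(*(i.budget for i in inputs))
    return Labeled(output, budget)


def allow_sink(sink: str, value: Labeled) -> bool:
    """A sink accepts a value only if the sink is still in the value's budget."""
    return sink in value.budget


# Hypothetical sinks and budgets, for illustration only.
ALL_SINKS = frozenset({"internal_log", "email_external"})
confidential = Labeled("salary table", frozenset({"internal_log"}))
public = Labeled("weather report", ALL_SINKS)

# Summarizing both inputs cannot yield more authority than the weakest input.
summary = compose(ALL_SINKS, "summary of both", confidential, public)
print(sorted(summary.budget))
print(allow_sink("email_external", summary))
```

Because set intersection can only return a subset of each operand, the output budget is always contained in every input budget, which is the monotonic attenuation property the summary describes.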

The deployment challenge revealed by the research is stark: expert-curated manifests achieve 100% attack blocking while naive manifests fail on 72.7% of attacks. Technical enforcement alone is therefore insufficient; specifying capabilities correctly and keeping them up to date become critical operational requirements. The authors acknowledge two limitations that bound the framework's current scope: it sees only explicit flows, and it assumes that manifests can be trusted. For enterprise deployments integrating multiple tool ecosystems, this represents a meaningful step toward safer agent composition, though manifest quality emerges as the dominant constraint on real-world effectiveness.
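Why manifest quality dominates can be shown with a small sketch in which the enforcement logic is identical and only the manifest differs. The manifest format, tool names, and sink names are assumptions made for illustration; the summary does not describe how the paper's naive and expert manifests are actually written.

```python
ALL_SINKS = frozenset({"internal_log", "email_external"})

# Hypothetical manifests: source tool name -> sinks its output may reach.
# The naive manifest grants every sink to every source.
NAIVE_MANIFEST = {
    "read_hr_record": ALL_SINKS,
    "fetch_weather": ALL_SINKS,
}
# The expert manifest restricts the confidential source to internal sinks.
EXPERT_MANIFEST = {
    "read_hr_record": frozenset({"internal_log"}),
    "fetch_weather": ALL_SINKS,
}


def chain_budget(manifest: dict, sources: list) -> frozenset:
    """Intersect the budgets of every source that contributed to a value."""
    budget = ALL_SINKS
    for source in sources:
        # Sources missing from the manifest get an empty budget (fail closed).
        budget &= manifest.get(source, frozenset())
    return budget


def is_blocked(manifest: dict, sources: list, sink: str) -> bool:
    """True if the enforcement layer would refuse to deliver the value to sink."""
    return sink not in chain_budget(manifest, sources)


attack = ["read_hr_record"]  # confidential data headed for external email
print(is_blocked(NAIVE_MANIFEST, attack, "email_external"))
print(is_blocked(EXPERT_MANIFEST, attack, "email_external"))
```

With the naive manifest the attack is not blocked; with the expert manifest it is, while a benign flow from `fetch_weather` to `email_external` stays allowed under both.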

Key Takeaways
  • ChainCaps reduces the success rate of composition attacks on AI agents from 25-68% to 0-4.8% by propagating capability budgets through tool chains.
  • Permission laundering enables unsafe multi-step attacks where each individual tool satisfies security checks but the composition produces dangerous outcomes.
  • The framework operates as a transparent proxy requiring zero changes to existing agents or tool servers, improving deployment viability.
  • Manifest quality is the primary bottleneck: expert manifests block 100% of attacks while naive manifests fail on 72.7% of attacks.
  • The approach maintains 96-100% benign task completion rates, indicating minimal disruption to legitimate agent operations.