arXiv – CS AI · 15h ago
ChainCaps: Composition-Safe Tool-Using Agents via Monotonic Capability Attenuation
Researchers present ChainCaps, a runtime safety framework that prevents tool-using AI agents from exploiting composed services through 'permission laundering', in which an agent passes intermediate results through multiple tools to achieve an outcome none of them would authorize on its own. The system uses capability budgets that propagate through tool chains via intersection, so a result can never carry more authority than the inputs it was derived from. In evaluation this reduced attack success rates from 25-68% to 0-4.8% while maintaining 96-100% benign task completion across frontier models.
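As an added illustration (not the paper's code; the capability names, tool grants and sample data are constructed here), a minimal model of budgets that can only shrink by intersection along a tool chain, with a per-tool-only check as the baseline it closes off:

```python
from dataclasses import dataclass
from functools import reduce

# Constructed capability vocabulary for this illustration (not from the paper).
READ, SUMM, SEND = "read:files", "summarize", "net:send"
ALL = frozenset({READ, SUMM, SEND})


@dataclass(frozen=True)
class Capability:
    """An immutable budget: the set of actions a value may be used for."""
    allowed: frozenset

    def attenuate(self, other: "Capability") -> "Capability":
        # Intersection is the only operation, so a budget never grows downstream.
        return Capability(self.allowed & other.allowed)


@dataclass(frozen=True)
class Tagged:
    """A value (user request or tool output) with the budget it carries."""
    value: object
    caps: Capability


class ToolDenied(PermissionError):
    pass


# Static per-tool grants (constructed): each tool may do its own job in isolation.
GRANT = {"summarize": Capability(ALL), "send_email": Capability(frozenset({SEND}))}


def call_tool(name, fn, needs, inputs, propagate=True):
    """Run one tool. `needs` must be in the tool's grant and, when propagate is on,
    in the intersection of all input budgets; the output inherits that intersection."""
    budget = reduce(Capability.attenuate, (t.caps for t in inputs)) if propagate else Capability(ALL)
    if needs not in GRANT[name].allowed or needs not in budget.allowed:
        raise ToolDenied(f"{name}: needs {needs}, budget {sorted(budget.allowed)}")
    return Tagged(fn(*[t.value for t in inputs]), budget)


def attempt(name, needs, inputs, propagate):
    try:
        call_tool(name, lambda *v: "ok", needs, inputs, propagate)
        return "allowed"
    except ToolDenied:
        return "denied"


def laundering_scenario(propagate):
    """Internal figures may be read and summarized but not sent; public notes may be sent.
    Blending both through the broadly granted summarizer and then mailing is the laundering try."""
    internal = Tagged("Q3 figures (constructed)", Capability(frozenset({READ, SUMM})))
    public = Tagged("release notes (constructed)", Capability(ALL))
    blended = call_tool("summarize", lambda a, b: f"{a} / {b}", SUMM, [public, internal], propagate)
    return {"public_only": attempt("send_email", SEND, [public], propagate),
            "blended": attempt("send_email", SEND, [blended], propagate),
            "blended_budget": sorted(blended.caps.allowed)}
```

With `propagate=True` the public notes still go out (benign completion) while the blended summary is denied because its budget lost `net:send` at the intersection; with `propagate=False` every tool passes its own grant check and the laundered send succeeds.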