Context-Fractured Decomposition Attacks on Tool-Using LLM Agents: Exploiting Artifact Provenance Gaps
Researchers demonstrate Context-Fractured Decomposition (CFD), a new class of jailbreak attacks against tool-using LLM agents that exploit gaps in artifact provenance tracking across multiple steps and system boundaries. By decomposing harmful requests across time and contexts while maintaining benign-looking intermediate artifacts, CFD achieves up to 28.3% higher success rates than existing attack methods, revealing fundamental vulnerabilities in how AI agents enforce safety guardrails in fragmented deployment environments.
This research exposes a critical vulnerability in how modern LLM agents enforce safety across distributed systems. Most existing jailbreak defenses assume a single, contiguous conversation that is fully visible to safety monitors. Real-world agent deployments, however, span multiple tools, modules, and temporal stages in which artifact provenance (the record of how data flows through the system) often remains opaque. The CFD attack methodology exploits this fragmentation by breaking a harmful request into individually innocuous steps separated by time and context boundaries, so the agent assembles malicious behavior without triggering any single-step safeguard.
The work is a natural next step in adversarial AI research, continuing the trend toward greater sophistication set by multi-turn and iterative attacks such as Crescendo and Tree of Attacks. Rather than attacking the model directly, CFD targets the deployment architecture itself, treating agent pipelines as complex systems with enforcement gaps. This shift from conversation-level to system-level attacks reflects how AI safety challenges scale with real-world complexity.
For organizations deploying tool-using agents in production, this work signals an urgent need for architectural review. The reported gain of up to 28.3% in success rate over existing attack methods suggests that current countermeasures, which are built around single-step checks, leave substantial gaps. The proposed mitigation, provenance lineage tagging, sets a technical direction but must be implemented across the entire agent ecosystem to be effective.
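To make the idea of provenance lineage tagging more concrete, here is a minimal sketch in Python. It is not the paper's implementation: the `Artifact` record, the `make_artifact` and `lineage` helpers, and the tool and session names are all hypothetical, chosen only to show how each artifact could carry a pointer to the artifacts it was derived from, so that a later check can see its full history across tools and sessions.

```python
from dataclasses import dataclass
from itertools import count

_ids = count(1)  # simple in-process ID generator (assumption for this sketch)


@dataclass(frozen=True)
class Artifact:
    """A piece of data produced by a tool call, tagged with its origins."""
    artifact_id: int
    content: str
    source: str          # hypothetical name of the tool or module that produced it
    session: str         # hypothetical session/context identifier
    parents: tuple = ()  # artifacts this one was derived from


def make_artifact(content, source, session, parents=()):
    """Create an artifact and record which artifacts it was built from."""
    return Artifact(next(_ids), content, source, session, tuple(parents))


def lineage(artifact):
    """Return the artifact and all of its ancestors, oldest first."""
    seen, order = set(), []

    def visit(a):
        if a.artifact_id in seen:
            return
        seen.add(a.artifact_id)
        for parent in a.parents:
            visit(parent)
        order.append(a)

    visit(artifact)
    return order


def sessions_in_lineage(artifact):
    """Collect every session that contributed to this artifact."""
    return {a.session for a in lineage(artifact)}


# Usage: three steps in three different sessions, linked by lineage tags.
note = make_artifact("outline", "notes_tool", "session-A")
draft = make_artifact("draft", "editor_tool", "session-B", [note])
final = make_artifact("final", "export_tool", "session-C", [draft])

print([a.source for a in lineage(final)])   # ['notes_tool', 'editor_tool', 'export_tool']
print(sorted(sessions_in_lineage(final)))   # ['session-A', 'session-B', 'session-C']
```

The design point is that the tag travels with the artifact. A safety monitor that only sees `final` can still recover that it was assembled from material created in other sessions, which is exactly the information a fragmented deployment otherwise loses.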
Looking ahead, this vulnerability class will likely drive adoption of transparent artifact tracking systems and cross-module safety verification mechanisms. Development teams must implement defense-in-depth strategies that monitor composed actions, not just isolated ones.
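The difference between monitoring isolated actions and monitoring composed ones can be shown with a toy example. The sketch below is an illustration under stated assumptions, not a real policy engine: the capability labels, the `FORBIDDEN_COMBINATIONS` list, and both check functions are hypothetical. Each step is represented only by a set of abstract labels, and the policy forbids certain labels from appearing together.

```python
# Hypothetical policy: each label is allowed on its own, but the listed
# combinations must not occur within one composed chain of actions.
FORBIDDEN_COMBINATIONS = [frozenset({"read_secrets", "send_external"})]


def step_allowed(labels):
    """Per-step check: evaluates one step in isolation."""
    return not any(combo <= set(labels) for combo in FORBIDDEN_COMBINATIONS)


def composition_allowed(steps):
    """Composed check: evaluates the union of labels across all linked steps."""
    combined = set().union(*steps)
    return not any(combo <= combined for combo in FORBIDDEN_COMBINATIONS)


# Three linked steps, each harmless by itself.
steps = [{"read_secrets"}, {"summarize"}, {"send_external"}]

print(all(step_allowed(s) for s in steps))  # True: every isolated step passes
print(composition_allowed(steps))           # False: the composed chain is flagged
```

A composed check of this kind only works if the monitor knows which steps belong to the same chain, which is why it depends on some form of artifact tracking across modules.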
- Context-Fractured Decomposition attacks achieve up to 28.3% higher jailbreak success rates than existing attack methods by exploiting fragmented artifact provenance tracking in tool-using LLM agents.
- Safety defenses designed for single, contiguous conversations fail against multi-step attacks that keep intermediate artifacts benign-looking across different system contexts and time periods.
- Real production deployments with fragmented enforcement across tools and modules create structural vulnerabilities that current benchmarks and defenses do not adequately address.
- Provenance lineage tagging emerges as the primary mitigation direction, requiring systematic implementation across entire agent ecosystems rather than isolated model-level fixes.
- This attack class targets deployment architecture rather than model vulnerabilities, indicating that AI safety challenges extend beyond conversation-level considerations into system design.