🧠 AI | ⚪ Neutral | Importance 6/10

AXE: Grey-Box Exploitability Confirmation for Localized Vulnerability Reports

arXiv – CS AI | Amirali Sajadi, Tu Nguyen, Kostadin Damevski, Preetha Chatterjee | June 23, 2026 at 04:00 AM

🤖 AI Summary

AXE, a multi-agent AI framework, confirms whether reported vulnerabilities are actually exploitable using minimal metadata such as CWE classifications and code locations. It achieves a 30% exploitation success rate, a 3x improvement over existing black-box approaches. The system generates actionable proof-of-concept exploits that help software maintainers validate and prioritize security findings more efficiently.

Analysis

AXE addresses a persistent pain point in software security: vulnerability detection tools generate numerous alerts, many of them false positives or lacking sufficient context for remediation, and the volume overwhelms development teams. By bridging the gap between automated detection and exploitation validation, the framework lets organizations triage reported vulnerabilities in web applications according to whether they can actually be exploited.

The research builds on an established trend in AI-assisted security, where automated systems increasingly handle repetitive technical tasks. Unlike previous approaches that operate independently of detection tooling, AXE integrates with detection pipelines using readily available metadata, specifically CWE classifications and source code locations, which makes it practical for real-world deployment. The framework's multi-agent architecture decouples planning, code exploration, and dynamic execution, so that a specialized component handles each stage of the exploitation task.
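To make the decoupling concrete, here is a minimal Python sketch of a three-stage pipeline that takes a CWE identifier and a code location as input. It is not AXE's implementation: the names `VulnReport`, `ExploitAttempt`, `confirm_exploitability`, and the stub stages are all hypothetical, and a real system would back each stage with an LLM agent and a live target rather than the placeholders used here.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class VulnReport:
    """Grey-box input: a CWE class plus a source location (hypothetical schema)."""
    cwe_id: str      # e.g. "CWE-89"
    file_path: str   # file flagged by the detector
    line: int        # line flagged by the detector


@dataclass
class ExploitAttempt:
    plan: List[str]
    context: str
    confirmed: bool
    poc: Optional[str] = None  # proof-of-concept artifact, if one succeeded


def confirm_exploitability(
    report: VulnReport,
    plan_fn: Callable[[VulnReport], List[str]],
    explore_fn: Callable[[VulnReport, List[str]], str],
    execute_fn: Callable[[List[str], str], Optional[str]],
) -> ExploitAttempt:
    """Run planning, code exploration, and execution as separate stages."""
    plan = plan_fn(report)               # stage 1: decide what to try
    context = explore_fn(report, plan)   # stage 2: gather relevant code
    poc = execute_fn(plan, context)      # stage 3: attempt the exploit
    return ExploitAttempt(plan, context, confirmed=poc is not None, poc=poc)


# Placeholder stages, standing in for the specialized agents.
def stub_planner(report: VulnReport) -> List[str]:
    return [
        f"find user input reaching {report.file_path}:{report.line}",
        f"craft a payload matching {report.cwe_id}",
    ]


def stub_explorer(report: VulnReport, plan: List[str]) -> str:
    return f"(source excerpt around {report.file_path}:{report.line})"


def stub_executor(plan: List[str], context: str) -> Optional[str]:
    # A real executor would run a payload against a running application.
    # This stub always reports that no exploit was confirmed.
    return None


result = confirm_exploitability(
    VulnReport("CWE-89", "app/db.py", 42),
    stub_planner, stub_explorer, stub_executor,
)
print(result.confirmed)
```

Passing the stages in as functions keeps each one replaceable on its own, which is the practical benefit of the decoupling described above.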

The 3x improvement in exploitation success over black-box baselines suggests substantial value for development teams struggling with alert fatigue. By confirming whether a reported vulnerability is actually exploitable, AXE enables prioritization based on demonstrated risk rather than on classification alone. Organizations can allocate remediation resources more effectively, reducing both security exposure and effort wasted on false positives. The reproducible proof-of-concept artifacts the framework generates also help maintainers verify that a patch closes the hole.
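One way to act on confirmation results is to rank findings so that confirmed ones come first. The sketch below is an illustration only: the `Finding` fields and the `prioritize` function are assumptions, not part of AXE. Unconfirmed findings are ranked lower rather than dropped, because at a 30% success rate a failed exploitation attempt does not show that a finding is a false positive.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Finding:
    finding_id: str
    cwe_id: str
    severity: float   # hypothetical detector-assigned score, 0 to 10
    confirmed: bool   # True if a proof-of-concept exploit succeeded


def prioritize(findings: List[Finding]) -> List[Finding]:
    """Confirmed findings first; within each group, higher severity first."""
    return sorted(findings, key=lambda f: (not f.confirmed, -f.severity))


queue = prioritize([
    Finding("A", "CWE-79", severity=9.0, confirmed=False),
    Finding("B", "CWE-89", severity=5.0, confirmed=True),
    Finding("C", "CWE-22", severity=7.0, confirmed=True),
])
print([f.finding_id for f in queue])  # ['C', 'B', 'A']
```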

A systematic error analysis reveals reasoning gaps around vulnerability semantics and execution preconditions, which points to room for improvement. Future iterations might benefit from models fine-tuned specifically on vulnerability exploitation patterns. As supply chain security concerns intensify and development velocity increases, automated vulnerability triage systems like AXE are likely to become standard infrastructure in enterprise security operations.

Key Takeaways

→AXE achieves a 30% exploitation success rate using grey-box metadata, a 3x improvement over existing black-box approaches
→The multi-agent framework integrates with detection pipelines using only CWE classifications and code locations, making deployment practical
→The framework generates reproducible proof-of-concept exploits that speed up vulnerability validation and remediation prioritization
→Error analysis shows that reasoning gaps remain, particularly around vulnerability semantics and execution preconditions
→A real-world case study demonstrates generalizability beyond evaluation on the academic CVE-Bench dataset