
Comparing AI Agents to Cybersecurity Professionals in Real-World Penetration Testing

arXiv – CS AI | Justin W. Lin, Eliot Krzysztof Jones, Donovan Julian Jasper, Ethan Jun-shen Ho, Anna Wu, Arnold Tianyi Yang, Neil Perry, Andy Zou, Matt Fredrikson, J. Zico Kolter, Percy Liang, Dan Boneh, Daniel E. Ho
AI Summary

Researchers conducted the first comprehensive evaluation comparing AI agents with human cybersecurity professionals in live penetration testing on a university network of 8,000 hosts. The new ARTEMIS agent framework placed second overall: it discovered 9 valid vulnerabilities with an 82% valid submission rate and outperformed 9 of 10 human participants, at a cost of $18/hour versus $60/hour for human testers.

Key Takeaways
  • ARTEMIS AI agent outperformed 9 out of 10 human cybersecurity professionals in real-world penetration testing.
  • AI agents demonstrated cost advantages at $18/hour compared to $60/hour for professional penetration testers.
  • ARTEMIS achieved an 82% valid submission rate while discovering 9 valid vulnerabilities in the enterprise environment.
  • AI agents excelled at systematic enumeration and parallel exploitation but struggled with GUI-based tasks and had higher false-positive rates.
  • Existing AI scaffolds like Codex and CyAgent underperformed compared to most human participants, highlighting the advancement of ARTEMIS.
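
The hourly rates quoted above can be compared with a line of arithmetic. The sketch below is illustrative only: the function name `hourly_savings` is hypothetical, and it assumes both figures are plain hourly costs in the same currency, as the summary reports them.

```python
def hourly_savings(agent_rate: float, human_rate: float) -> tuple[float, float]:
    """Return (cost ratio, fractional saving) of the agent relative to a human tester."""
    if human_rate <= 0:
        raise ValueError("human_rate must be positive")
    ratio = agent_rate / human_rate
    return ratio, 1.0 - ratio


# Rates quoted in the summary: $18/hour for ARTEMIS, $60/hour for a professional tester.
ratio, saving = hourly_savings(18.0, 60.0)
print(f"Agent cost is {ratio:.0%} of the human rate, a {saving:.0%} saving per hour.")
```

At the quoted rates the agent costs 30% of the human rate, a 70% saving per hour. This compares hourly rates only; it does not account for how many valid findings each hour produces.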