#vulnerability-assessment News & Analysis

13 articles tagged with #vulnerability-assessment. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

13 articles

AIBullisharXiv – CS AI · 16h ago7/10

🧠

CVE-Factory: Scaling Expert-Level Agentic Tasks for Code Security Vulnerability

CVE-Factory is an automated multi-agent framework that transforms vulnerability metadata into executable security tasks with expert-level quality, achieving 95% correctness and enabling the creation of LiveCVEBench—a continuously updated benchmark of 190 security tasks across 14 programming languages that advances AI code security evaluation.

🧠 Claude

AIBearisharXiv – CS AI · 3d ago7/10

🧠

How Reliable Are AI Attackers Against a Fixed Vulnerable Target? A 400-Run Empirical Study of LLM Penetration Testing Consistency

Researchers conducted 400 autonomous penetration testing runs across four LLM models against a fixed vulnerable target to measure attack consistency. Results show significant variation in exploitation success rates (25-85%) and distinctive failure modes per model, with Claude and Gemini 2.5 Flash-Lite substantially outperforming GPT-4o-mini and Qwen, raising critical questions about LLM reliability in security-critical autonomous operations.

🏢 Anthropic🧠 GPT-4🧠 Claude

AIBearisharXiv – CS AI · 3d ago7/10

🧠

SafeSearch: Automated Red-Teaming of LLM-Based Search Agents

Researchers introduce SafeSearch, an automated red-teaming framework that identifies critical vulnerabilities in LLM-based search agents by testing them against 300 adversarial cases spanning misinformation, prompt injection, and other risks. The study reveals that current search agents achieve attack success rates up to 90.5%, with common defenses like reminder prompting providing minimal protection.

🧠 GPT-4

AIBearisharXiv – CS AI · 4d ago7/10

🧠

Technical Report: Exploring the Emerging Threats of the Agent Skill Ecosystem

Researchers identified 76 confirmed malicious AI agent skills across major marketplaces, with 13.4% of 3,984 analyzed skills containing critical security vulnerabilities. The findings highlight urgent risks as AI agents gain access to sensitive credentials and systems, with malicious payloads still publicly available on platforms like clawhub.ai.

AIBearisharXiv – CS AI · May 77/10

🧠

DecodingTrust-Agent Platform (DTap): A Controllable and Interactive Red-Teaming Platform for AI Agents

Researchers introduce DecodingTrust-Agent Platform (DTap), a red-teaming framework designed to systematically test AI agent vulnerabilities across 14 real-world domains. The platform includes an autonomous red-teaming agent (DTap-Red) that discovers attack strategies and a benchmarking dataset, revealing critical security gaps in popular AI agents that could enable API key theft, unauthorized transactions, and data deletion.

AIBearisharXiv – CS AI · Apr 147/10

🧠

Persona Non Grata: Single-Method Safety Evaluation Is Incomplete for Persona-Imbued LLMs

Researchers demonstrate that safety evaluations of persona-imbued large language models using only prompt-based testing are fundamentally incomplete, as activation steering reveals entirely different vulnerability profiles across model architectures. Testing across four models reveals the 'prosocial persona paradox' where conscientious personas safe under prompting become the most vulnerable to activation steering attacks, indicating that single-method safety assessments can miss critical failure modes.

🧠 Llama

AIBearisharXiv – CS AI · Apr 67/10

🧠

A Systematic Security Evaluation of OpenClaw and Its Variants

A comprehensive security evaluation of six OpenClaw-series AI agent frameworks reveals substantial vulnerabilities across all tested systems, with agentized systems proving significantly riskier than their underlying models. The study identified reconnaissance and discovery behaviors as the most common weaknesses, while highlighting that security risks are amplified through multi-step planning and runtime orchestration capabilities.

AIBullisharXiv – CS AI · Mar 47/102

🧠

Comparing AI Agents to Cybersecurity Professionals in Real-World Penetration Testing

Researchers conducted the first comprehensive evaluation comparing AI agents to human cybersecurity professionals in live penetration testing on a university network with 8,000 hosts. The new ARTEMIS AI agent framework placed second overall, discovering 9 vulnerabilities with 82% accuracy and outperforming 9 of 10 human participants while costing significantly less at $18/hour versus $60/hour for human testers.

AINeutralarXiv – CS AI · May 126/10

🧠

From Controlled to the Wild: Evaluation of Pentesting Agents for the Real-World

Researchers present a new evaluation protocol for AI pentesting agents that moves beyond simplified benchmarks to assess real-world vulnerability discovery capabilities. The framework combines structured ground-truth validation with LLM-based semantic matching and includes efficiency metrics, addressing a critical gap in how offensive security AI systems are currently measured.

AINeutralarXiv – CS AI · Apr 106/10

🧠

SkillSieve: A Hierarchical Triage Framework for Detecting Malicious AI Agent Skills

Researchers introduced SkillSieve, a three-layer detection framework that identifies malicious AI agent skills in OpenClaw's ClawHub marketplace, where 13-26% of over 13,000 skills contain security vulnerabilities. The system combines regex/AST scanning, LLM-based analysis with parallel sub-tasks, and multi-LLM voting to achieve 0.800 F1 score at $0.006 per skill, significantly outperforming existing detection methods.

AINeutralarXiv – CS AI · Mar 96/10

🧠

ESAA-Security: An Event-Sourced, Verifiable Architecture for Agent-Assisted Security Audits of AI-Generated Code

Researchers have developed ESAA-Security, a new architecture for conducting secure, verifiable audits of AI-generated code using structured agent workflows rather than unstructured LLM conversations. The system creates an immutable audit trail through event-sourcing and produces comprehensive security reports across 26 tasks and 95 executable checks.

AIBullisharXiv – CS AI · Mar 36/109

🧠

AWE: Adaptive Agents for Dynamic Web Penetration Testing

Researchers introduced AWE, a memory-augmented multi-agent framework for autonomous web penetration testing that outperforms existing tools on injection vulnerabilities. AWE achieved 87% XSS success and 66.7% blind SQL injection success on benchmark tests, demonstrating superior accuracy and efficiency compared to general-purpose AI penetration testing tools.

AINeutralOpenAI News · Nov 215/102

🧠

Advancing red teaming with people and AI

The article discusses advancements in red teaming methodologies that combine human expertise with artificial intelligence capabilities. This represents a significant development in cybersecurity practices and AI safety testing approaches.