y0news

#vulnerability-detection News & Analysis

16 articles tagged with #vulnerability-detection. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Neutral · Crypto Briefing · 5d ago · 7/10

Brad Gerstner: Detachment from desires fosters personal achievement, Anthropic’s Mythos reveals critical vulnerabilities, and proactive AI measures are essential for cybersecurity | All-In Podcast

Brad Gerstner discussed Anthropic's AI model discoveries on the All-In Podcast, highlighting how advanced AI systems are exposing critical software vulnerabilities before they become widely exploited. The findings underscore the urgent need for companies to implement proactive cybersecurity measures as AI capabilities accelerate toward mainstream adoption.

🏢 Anthropic
AI · Bullish · arXiv – CS AI · Mar 26 · 7/10

OSS-CRS: Liberating AIxCC Cyber Reasoning Systems for Real-World Open-Source Security

Researchers have created OSS-CRS, an open framework that makes DARPA's AI Cyber Challenge systems usable for real-world cybersecurity applications. The system successfully ported the winning Atlantis CRS and discovered 10 previously unknown bugs, including 3 high-severity issues, across 8 open-source projects.

AI × Crypto · Bullish · arXiv – CS AI · Mar 17 · 7/10

Benchmarking Zero-Shot Reasoning Approaches for Error Detection in Solidity Smart Contracts

Researchers benchmarked state-of-the-art LLMs for detecting vulnerabilities in Solidity smart contracts using zero-shot prompting strategies. The study found that Chain-of-Thought and Tree-of-Thought approaches significantly improved recall (95-99%) but reduced precision, while Claude 3 Opus achieved the best classification performance with an F1 score of 90.8.
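The zero-shot strategies compared above differ mainly in how the prompt asks the model to reason. A minimal sketch of a Chain-of-Thought prompt builder, with an illustrative class list and wording (the paper's actual templates are not reproduced here):

```python
# Sketch of a zero-shot Chain-of-Thought prompt for Solidity vulnerability
# detection. The vulnerability classes and wording are illustrative
# assumptions, not the paper's actual templates.

VULN_CLASSES = ["reentrancy", "integer overflow", "unchecked call", "access control"]

def build_cot_prompt(source: str) -> str:
    """Assemble a single-contract Chain-of-Thought audit prompt."""
    classes = ", ".join(VULN_CLASSES)
    return (
        "You are a smart-contract security auditor.\n"
        f"Candidate vulnerability classes: {classes}.\n"
        "Think step by step: (1) trace external calls and state writes, "
        "(2) check arithmetic and access control, (3) conclude.\n"
        "Answer with the classes found, or 'none'.\n\n"
        "Solidity source:\n" + source
    )

prompt = build_cot_prompt("contract Bank { function withdraw() public {} }")
```

The Tree-of-Thought variant would branch this into several independent reasoning paths and aggregate their verdicts.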

🧠 Claude
AI · Bearish · arXiv – CS AI · Mar 4 · 7/10

ZeroDayBench: Evaluating LLM Agents on Unseen Zero-Day Vulnerabilities for Cyberdefense

Researchers introduced ZeroDayBench, a new benchmark testing LLM agents' ability to find and patch 22 critical vulnerabilities in open-source code. Testing on frontier models GPT-5.2, Claude Sonnet 4.5, and Grok 4.1 revealed that current LLMs cannot yet autonomously solve cybersecurity tasks, highlighting limitations in AI-powered code security.

AI · Bullish · arXiv – CS AI · Feb 27 · 7/10

Automated Vulnerability Detection in Source Code Using Deep Representation Learning

Researchers developed a convolutional neural network model that can automatically detect vulnerabilities in C source code using deep learning techniques. The model was trained on datasets from Draper Labs and NIST, achieving higher recall than previous work while maintaining high precision and demonstrating effectiveness on real Linux kernel vulnerabilities.
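A pipeline of this kind starts by lexing source code into a flat token sequence before embedding and convolution. A toy version of that front end, assuming heavily simplified token categories (the actual work uses a much richer lexer and vocabulary):

```python
import re

# Toy C lexer of the kind such a pipeline feeds into an embedding + CNN.
# The token categories below are a simplification for illustration.
TOKEN_RE = re.compile(
    r"(?P<ident>[A-Za-z_]\w*)|(?P<num>\d+)|(?P<op>[{}()\[\];,=+\-*/<>!&|.]+)"
)

def lex(source: str) -> list[str]:
    """Map C source to a flat token sequence for the model."""
    tokens = []
    for m in TOKEN_RE.finditer(source):
        # Bucket numeric literals into one symbol to shrink the vocabulary.
        tokens.append("<NUM>" if m.lastgroup == "num" else m.group())
    return tokens

tokens = lex("int x = 42;")
```

Each token is then mapped to a learned embedding, and 1-D convolutions over the sequence pick up local patterns associated with vulnerable code.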

AI × Crypto · Bullish · The Defiant · Feb 18 · 7/10

OpenAI Unveils AI Benchmark Tool to Enhance Blockchain Security

OpenAI has partnered with Paradigm to launch EVMbench, a new AI benchmark tool designed to evaluate artificial intelligence agents' capabilities in detecting, patching, and exploiting smart contract vulnerabilities. This tool represents a significant step forward in using AI to enhance blockchain security infrastructure.

AI × Crypto · Bullish · Bankless · Feb 18 · 7/10

OpenAI and Paradigm Introduce 'EVMbench' for AI Agent Benchmarking

OpenAI and Paradigm have launched EVMbench, a new benchmarking tool designed to evaluate AI agents' capabilities in detecting, exploiting, and patching high-severity smart contract vulnerabilities. This represents a significant step toward using AI for automated smart contract security auditing and vulnerability management.

AI × Crypto · Bullish · OpenAI News · Feb 18 · 7/10

Introducing EVMbench

OpenAI and Paradigm have launched EVMbench, a new benchmark tool designed to evaluate AI agents' capabilities in detecting, patching, and exploiting high-severity vulnerabilities in smart contracts. This collaboration represents a significant step toward improving smart contract security through AI-powered analysis tools.

AI · Bullish · OpenAI News · Oct 30 · 7/10

Introducing Aardvark: OpenAI’s agentic security researcher

OpenAI has launched Aardvark, an AI-powered autonomous security researcher that can find, validate, and help fix software vulnerabilities at scale. The system is currently in private beta with early testing available through sign-up.

AI · Neutral · crypto.news · 5d ago · 6/10

AI Cybersecurity Race: OpenAI Finalizes Product While Anthropic Runs Project Glasswing to Hunt Critical Vulnerabilities

OpenAI and Anthropic are escalating competition in AI-powered cybersecurity, with OpenAI finalizing a commercial security product for limited partner deployment while Anthropic operates Project Glasswing, a controlled initiative focused on discovering critical software vulnerabilities. This competitive race signals that both AI labs view cybersecurity as a strategically important application area with commercial and defensive value.

🏢 OpenAI · 🏢 Anthropic
AI · Bullish · OpenAI News · Mar 6 · 5/10

Codex Security: now in research preview

Codex Security, an AI-powered application security agent, has launched in research preview to help developers detect, validate, and patch complex vulnerabilities. The tool analyzes project context to provide more accurate security assessments with reduced false positives.

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10

Inference-Time Safety For Code LLMs Via Retrieval-Augmented Revision

Researchers developed a new inference-time safety mechanism for code-generating AI models that uses retrieval-augmented generation to identify and fix security vulnerabilities in real-time. The approach leverages Stack Overflow discussions to guide AI code revision without requiring model retraining, improving security while maintaining interpretability.
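The retrieval step in an approach like this can be pictured as ranking cached (insecure pattern, advice) pairs against the freshly generated code. A toy sketch using bag-of-words Jaccard overlap as a stand-in for real similarity search; the corpus entries are made up, and the actual system presumably retrieves over Stack Overflow content with embeddings:

```python
import re

# Toy retrieval step for retrieval-augmented code revision: rank cached
# (insecure-pattern, advice) pairs by word overlap with generated code.
# Jaccard similarity stands in for real embedding-based retrieval.

def tokenize(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9_]+", text.lower()))

def retrieve(code: str, corpus: list[tuple[str, str]], k: int = 1) -> list[str]:
    """Return advice from the k corpus entries most similar to `code`."""
    code_toks = tokenize(code)
    scored = []
    for pattern, advice in corpus:
        pat_toks = tokenize(pattern)
        score = len(code_toks & pat_toks) / len(code_toks | pat_toks)
        scored.append((score, advice))
    scored.sort(key=lambda s: s[0], reverse=True)
    return [advice for _, advice in scored[:k]]

corpus = [
    ("subprocess call with shell true", "Avoid shell=True; pass an argument list."),
    ("sql query built by string concatenation", "Use parameterized queries."),
]
hits = retrieve("subprocess.run(cmd, shell=True)", corpus)
```

The retrieved advice is then injected into a revision prompt, letting the model patch its own output without any retraining.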

AI × Crypto · Bearish · CoinTelegraph – AI · Mar 3 · 7/10

OpenZeppelin finds data contamination in OpenAI’s EVMbench

OpenZeppelin discovered significant flaws in OpenAI's EVMbench dataset, including data contamination from training leaks and at least four incorrectly classified high-severity vulnerabilities. This finding raises concerns about the reliability of AI tools used for blockchain security auditing.

AI · Bullish · arXiv – CS AI · Mar 2 · 7/10

Learning to Generate Secure Code via Token-Level Rewards

Researchers have developed Vul2Safe, a new framework for generating secure code using large language models, which addresses security vulnerabilities through self-reflection and token-level reinforcement learning. The approach introduces the PrimeVul+ dataset and SRCode training framework to provide more precise optimization of security patterns in code generation.
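Token-level rewards of the kind mentioned here amount to per-token credit assignment: tokens inside a flagged vulnerable span are penalized while the rest earn a small positive reward. A sketch with made-up reward values and span indices, not the paper's actual reward model:

```python
# Toy token-level reward assignment for secure code generation: tokens
# inside flagged vulnerable spans get a negative reward, all others a
# small positive one. Reward values and span source are illustrative.

def token_rewards(tokens: list[str],
                  vuln_spans: list[tuple[int, int]],
                  pos: float = 0.1, neg: float = -1.0) -> list[float]:
    """Per-token rewards; vuln_spans are half-open [start, end) ranges."""
    rewards = [pos] * len(tokens)
    for start, end in vuln_spans:
        for i in range(start, min(end, len(tokens))):
            rewards[i] = neg
    return rewards

toks = ["query", "=", "base_sql", "+", "user_input"]   # naive SQL concatenation
r = token_rewards(toks, vuln_spans=[(3, 5)])           # penalize the concatenation
```

Optimizing against per-token rather than sequence-level rewards lets training target the specific tokens that introduce a vulnerability.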

AI · Bullish · arXiv – CS AI · Mar 2 · 6/10

Enhancing Continual Learning for Software Vulnerability Prediction: Addressing Catastrophic Forgetting via Hybrid-Confidence-Aware Selective Replay for Temporal LLM Fine-Tuning

Researchers developed Hybrid Class-Aware Selective Replay (Hybrid-CASR), a continual learning method that improves AI-based software vulnerability detection by addressing catastrophic forgetting in temporal scenarios. The method achieved 0.667 Macro-F1 score while reducing training time by 17% compared to baseline approaches on CVE data from 2018-2024.
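Selective replay along these lines can be sketched as a confidence-gated buffer: only samples the model is still unsure about are kept and mixed into later fine-tuning batches, so earlier vulnerability patterns are not forgotten. The threshold, capacity, and sampling policy below are illustrative, not the paper's settings:

```python
import random

# Sketch of confidence-aware selective replay for continual learning.
# Only low-confidence (i.e. still-hard) samples from earlier time periods
# are retained for replay during later fine-tuning rounds.

class ReplayBuffer:
    def __init__(self, capacity: int = 100, conf_threshold: float = 0.8):
        self.capacity = capacity
        self.conf_threshold = conf_threshold  # confident samples are skipped
        self.items: list[str] = []

    def maybe_add(self, sample: str, confidence: float) -> bool:
        """Keep a sample only if the model is not yet confident on it."""
        if confidence < self.conf_threshold and len(self.items) < self.capacity:
            self.items.append(sample)
            return True
        return False

    def replay_batch(self, k: int) -> list[str]:
        """Draw up to k stored samples to mix into the current batch."""
        return random.sample(self.items, min(k, len(self.items)))

buf = ReplayBuffer()
buf.maybe_add("2018-era vulnerable function", confidence=0.55)  # kept: still hard
buf.maybe_add("2024-era patched function", confidence=0.95)     # skipped: easy
```

Replaying only hard examples is what keeps the replay budget (and hence training time) small while still countering catastrophic forgetting.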