216 articles tagged with #ai-security. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.
AIBullisharXiv – CS AI · Mar 37/107
🧠Researchers introduce ROKA, a new machine unlearning method that prevents knowledge contamination and indirect attacks on AI models. The approach uses 'Neural Healing' to preserve important knowledge while forgetting targeted data, providing theoretical guarantees for knowledge preservation during unlearning.
AIBearisharXiv – CS AI · Mar 36/108
🧠Researchers identified widespread TOCTOU (time of check to time of use) vulnerabilities in browser-use agents, where web pages change between planning and execution phases, potentially causing unintended actions. A study of 10 popular open-source agents revealed these security flaws are common, prompting development of a lightweight mitigation strategy based on pre-execution validation.
AIBearisharXiv – CS AI · Mar 37/107
🧠Researchers have developed CaptionFool, a universal adversarial attack that can manipulate AI image captioning models by modifying just 1.2% of image patches. The attack achieves 94-96% success rates in forcing models to generate arbitrary captions, including offensive content that can bypass content moderation systems.
AIBearisharXiv – CS AI · Mar 37/108
🧠Researchers have developed MIDAS, a new jailbreaking framework that successfully bypasses safety mechanisms in Multimodal Large Language Models by dispersing harmful content across multiple images. The technique achieved an 81.46% average attack success rate against four closed-source MLLMs by extending reasoning chains and reducing security attention.
$LINK
AIBearisharXiv – CS AI · Mar 36/107
🧠Researchers have developed HIDE&SEEK (HS), a new attack method that can effectively remove watermarks from machine-generated images while maintaining visual quality. This research exposes vulnerabilities in current state-of-the-art proactive image watermarking defenses, highlighting the ongoing arms race between watermarking protection and removal techniques.
AIBullisharXiv – CS AI · Mar 37/106
🧠Researchers have developed AloePri, the first privacy-preserving LLM inference method designed for industrial applications. The system uses collaborative obfuscation to protect input/output data while maintaining 96.5-100% accuracy and resisting state-of-the-art attacks, successfully tested on a 671B parameter model.
AIBearisharXiv – CS AI · Mar 37/108
🧠Researchers have identified significant privacy risks in Large Language Model-based Task-Oriented Dialogue Systems, demonstrating that these AI systems can memorize and leak sensitive training data including phone numbers and complete dialogue exchanges. The study proposes new attack methods that can extract thousands of training dialogue states with over 70% precision in best-case scenarios.
$RNDR
AINeutralarXiv – CS AI · Mar 36/104
🧠Researchers have developed AQUA, the first watermarking framework designed to protect image copyright in Multimodal Retrieval-Augmented Generation (RAG) systems. The framework addresses a critical gap in protecting visual content within RAG-as-a-Service platforms by embedding semantic signals into synthetic images that survive the retrieval-to-generation process.
AI × CryptoBearishCoinTelegraph – AI · Mar 37/107
🤖OpenZeppelin discovered significant flaws in OpenAI's EVMbench dataset, including data contamination from training leaks and at least four incorrectly classified high-severity vulnerabilities. This finding raises concerns about the reliability of AI tools used for blockchain security auditing.
AIBearishTechCrunch – AI · Mar 27/109
🧠Tech workers have signed an open letter calling on the Department of Defense to withdraw its designation of AI company Anthropic as a "supply chain risk" and resolve the issue through quiet negotiations instead.
CryptoNeutralNewsBTC · Mar 26/104
⛓️February 2026 saw crypto hack losses drop to just $26.5 million across 15 incidents, representing a 69% decline from January and the lowest monthly figure in 11 months. Two major attacks on YieldBlox ($10M) and IoTeX ($9M) accounted for over 70% of total losses, while improved security standards and AI-powered monitoring tools are helping reduce vulnerabilities.
$BTC$XRP
AIBullisharXiv – CS AI · Mar 26/1012
🧠Researchers developed Hybrid Class-Aware Selective Replay (Hybrid-CASR), a continual learning method that improves AI-based software vulnerability detection by addressing catastrophic forgetting in temporal scenarios. The method achieved 0.667 Macro-F1 score while reducing training time by 17% compared to baseline approaches on CVE data from 2018-2024.
AIBullisharXiv – CS AI · Mar 27/1013
🧠Researchers developed MI²DAS, a multi-layer intrusion detection framework for Industrial IoT networks that uses incremental learning to adapt to new cyber threats. The system achieved strong performance across multiple layers, with 95.3% accuracy in normal-attack discrimination and robust detection of both known and unknown attacks.
$DAS
AINeutralarXiv – CS AI · Mar 26/1014
🧠Researchers introduce Jailbreak Foundry (JBF), a system that automatically converts AI jailbreak research papers into executable code modules for standardized testing. The system successfully reproduced 30 attacks with high accuracy and reduces implementation code by nearly half while enabling consistent evaluation across multiple AI models.
AINeutralarXiv – CS AI · Mar 27/1010
🧠Researchers introduce Veritas, a multi-modal large language model designed for deepfake detection that uses pattern-aware reasoning to mimic human forensic processes. The system addresses real-world challenges through the HydraFake dataset and achieves significant improvements in detecting unseen forgeries across different domains.
AIBearishOpenAI News · Feb 256/106
🧠A new threat report analyzes how malicious actors are combining AI models with websites and social platforms to carry out attacks. The report examines the implications of these AI-powered threats for detection and defense systems.
AINeutralOpenAI News · Jan 286/105
🧠OpenAI has implemented safeguards to protect user data when AI agents interact with external links, addressing potential security vulnerabilities. The measures focus on preventing URL-based data exfiltration and prompt injection attacks that could compromise user information.
$LINK
AIBearishIEEE Spectrum – AI · Jan 216/105
🧠Large language models (LLMs) remain highly vulnerable to prompt injection attacks where specific phrasing can override safety guardrails, causing AI systems to perform forbidden actions or reveal sensitive information. Unlike humans who use contextual judgment and layered defenses, current LLMs lack the ability to assess situational appropriateness and cannot universally prevent such attacks.
AIBearishOpenAI News · Nov 266/104
🧠OpenAI disclosed a security incident involving their analytics partner Mixpanel that exposed limited API analytics data. The company confirmed that no API content, user credentials, or payment information was compromised in the breach.
AIBullishHugging Face Blog · Oct 226/105
🧠Hugging Face has partnered with VirusTotal to enhance AI model security by integrating malware scanning capabilities. This collaboration aims to protect the AI ecosystem from malicious models and strengthen security protocols across AI platforms.
AIBullishGoogle DeepMind Blog · May 206/105
🧠Google has announced that Gemini 2.5 is their most secure AI model family to date, highlighting enhanced security safeguards. The announcement suggests continued improvements in AI safety and security measures for their flagship language model.
AINeutralOpenAI News · Mar 266/107
🧠OpenAI is implementing comprehensive security measures directly into their infrastructure and models as they progress toward artificial general intelligence (AGI). The company emphasizes proactive adaptation to address security challenges on the path to AGI development.
AINeutralOpenAI News · Jan 225/105
🧠The article discusses research on trading computational resources during inference time to improve adversarial robustness in AI systems. This approach explores how allocating more compute power at inference can enhance model security against adversarial attacks.
AINeutralOpenAI News · Nov 215/102
🧠The article discusses advancements in red teaming methodologies that combine human expertise with artificial intelligence capabilities. This represents a significant development in cybersecurity practices and AI safety testing approaches.
AINeutralOpenAI News · May 286/105
🧠OpenAI has established a new Safety and Security Committee as part of its board structure. This move comes as the AI company continues to scale its operations and address growing concerns about AI safety and security governance.