#ai-security News & Analysis

216 articles tagged with #ai-security. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

216 articles

AIBearishOpenAI News · Apr 196/105

🧠

The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions

Large Language Models (LLMs) currently face significant security vulnerabilities from prompt injections and jailbreaks, where attackers can override the model's original instructions with malicious prompts. This highlights a critical weakness in current AI systems' ability to maintain instruction integrity and security.

AIBullishHugging Face Blog · Apr 46/108

🧠

Hugging Face partners with Wiz Research to Improve AI Security

Hugging Face has partnered with Wiz Research to enhance AI security measures. This collaboration aims to improve security protocols and protect AI models and datasets on the Hugging Face platform.

AIBullishOpenAI News · Aug 286/107

🧠

Introducing ChatGPT Enterprise

OpenAI announces ChatGPT Enterprise, a new business-focused version of their AI chatbot offering enhanced security, privacy features, and more powerful capabilities. This represents OpenAI's strategic push into the enterprise market with premium AI services.

AIBullishOpenAI News · Jun 16/105

🧠

OpenAI Cybersecurity Grant Program

OpenAI has launched a cybersecurity grant program aimed at supporting the development of AI-powered security capabilities for defensive purposes. The program will provide grants and additional support to facilitate innovation in AI-driven cybersecurity solutions.

AIBullishHugging Face Blog · May 236/105

🧠

🐶Safetensors audited as really safe and becoming the default

The article title suggests Safetensors, a secure file format for machine learning models, has undergone a security audit and is being adopted as the default format. This indicates improved security standards in AI model distribution and storage.

AINeutralOpenAI News · Aug 226/106

🧠

Testing robustness against unforeseen adversaries

Researchers have developed a new method to evaluate neural network classifiers' ability to defend against previously unseen adversarial attacks. The approach introduces the UAR (Unforeseen Attack Robustness) metric to assess model performance against unanticipated threats and emphasizes testing across diverse attack scenarios.

AINeutralOpenAI News · Feb 206/105

🧠

Preparing for malicious uses of AI

A collaborative research paper was published forecasting how malicious actors could misuse AI technology and proposing prevention and mitigation strategies. The year-long research effort involved multiple institutions including the Future of Humanity Institute, Centre for the Study of Existential Risk, and Electronic Frontier Foundation.

AIBearishOpenAI News · Feb 246/105

🧠

Attacking machine learning with adversarial examples

Adversarial examples are specially crafted inputs designed to fool machine learning models into making incorrect predictions, functioning like optical illusions for AI systems. The article explores how these attacks work across different mediums and highlights the challenges in defending ML systems against such vulnerabilities.

AINeutralarXiv – CS AI · Mar 275/10

🧠

NERO-Net: A Neuroevolutionary Approach for the Design of Adversarially Robust CNNs

Researchers developed NERO-Net, a neuroevolutionary approach to design convolutional neural networks with inherent resistance to adversarial attacks without requiring robust training methods. The evolved architecture achieved 47% adversarial accuracy and 93% clean accuracy on CIFAR-10, demonstrating that architectural design can provide intrinsic robustness against adversarial examples.

AINeutralarXiv – CS AI · Mar 175/10

🧠

Privacy-Preserving Explainable AIoT Application via SHAP Entropy Regularization

Researchers developed a privacy-preserving method using SHAP entropy regularization to protect sensitive user data in explainable AI systems for smart home IoT applications. The approach reduces privacy leakage while maintaining model accuracy and explanation quality.

AINeutralarXiv – CS AI · Mar 125/10

🧠

Enhancing Network Intrusion Detection Systems: A Multi-Layer Ensemble Approach to Mitigate Adversarial Attacks

Researchers developed a multi-layer ensemble defense system to protect AI-powered Network Intrusion Detection Systems (NIDS) from adversarial attacks. The solution combines stacking classifiers with autoencoder validation and adversarial training, demonstrating improved resilience against GAN and FGSM-generated attacks on security datasets.

AIBullishHugging Face Blog · Mar 45/105

🧠

Hugging Face and JFrog partner to make AI Security more transparent

The article title mentions a partnership between Hugging Face and JFrog to improve AI security transparency, but no article body content was provided for analysis.

AINeutralOpenAI News · May 34/106

🧠

Transfer of adversarial robustness between perturbation types

The article discusses research on adversarial robustness transfer between different types of perturbations in machine learning models. This research examines how defensive techniques developed for one type of attack may provide protection against other types of adversarial examples.

AINeutralarXiv – CS AI · Mar 24/106

🧠

Concept-based Adversarial Attack: a Probabilistic Perspective

Researchers propose a new concept-based adversarial attack framework that targets entire concept distributions rather than single images, generating diverse adversarial examples while preserving the original concept identity. The method creates adversarial images with variations in pose, viewpoint, or background that can still mislead classifiers while remaining recognizable as instances of the original category.

AINeutralThe Register – AI · Apr 61/10

🧠

Anthropic sure has a mess on its hands thanks to that Claude Code source leak

The article title references a Claude Code source leak affecting Anthropic, but no article body content was provided for analysis. Without the actual article content, specific details about the nature, scope, or implications of this reported leak cannot be determined.

🏢 Anthropic🧠 Claude

AINeutralOpenAI News · Feb 81/106

🧠

Adversarial attacks on neural network policies

The article appears to have no content provided, with only a title about adversarial attacks on neural network policies. Without the actual article body, no meaningful analysis of the research or its implications can be performed.

← PrevPage 9 of 9