y0news

#machine-learning-security News & Analysis

22 articles tagged with #machine-learning-security. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bearish · arXiv – CS AI · 6d ago · 7/10

MaskForge: Structure-Aware Adaptive Attacks for Jailbreaking Diffusion Large Language Models

Researchers introduce MaskForge, a black-box attack method that exploits structural vulnerabilities in diffusion-based large language models (dLLMs) by leveraging their native masking capabilities. The technique achieves 79.3% average success rates across five models and transfers effectively to other benchmarks, demonstrating a significant security gap in an emerging class of language models distinct from standard autoregressive architectures.

AI · Bearish · arXiv – CS AI · Jun 2 · 7/10

Erased but Not Forgotten: How Backdoors Compromise Concept Erasure

Researchers have discovered a critical vulnerability called Erasure Evasion Backdoor (EEB) that allows adversaries to bypass concept erasure methods in text-to-image diffusion models by binding malicious triggers to concepts marked for removal. The backdoor survives the erasure process across six state-of-the-art methods, achieving up to 94% success rates in exposing harmful content, revealing fundamental weaknesses in current AI safety approaches.

AI · Bullish · arXiv – CS AI · May 28 · 7/10

Disentangling Adversarial Prompts: A Semantic-Graph Defense for Robust LLM Security

Researchers propose the Adversarial Prompt Disentanglement (APD) framework, a defense mechanism that identifies and neutralizes malicious components in LLM inputs before processing. The system combines semantic decomposition, graph-based intent classification, and transformer-based detection to reduce harmful outputs by over 85% while maintaining model performance.

AI · Bullish · arXiv – CS AI · May 9 · 7/10

DeTrigger: A Gradient-Centric Approach to Backdoor Attack Mitigation in Federated Learning

DeTrigger is a new federated learning framework that uses gradient analysis to detect and neutralize backdoor attacks in distributed machine learning systems. The approach achieves 251x faster detection than existing methods while mitigating 98.9% of backdoor attacks with minimal accuracy loss, addressing a critical vulnerability in privacy-preserving collaborative AI training.

AI · Bearish · arXiv – CS AI · May 7 · 7/10

On the (In-)Security of the Shuffling Defense in the Transformer Secure Inference

Researchers demonstrate that the shuffling defense mechanism used to protect Transformer model weights during secure inference can be broken through an alignment attack, allowing adversaries to recover weights with minimal cost. The attack exploits multiple shuffled activations by finding a common permutation, undermining a key security assumption in privacy-preserving machine learning.

AI · Bearish · arXiv – CS AI · Apr 20 · 7/10

Power to the Clients: Federated Learning in a Dictatorship Setting

Researchers identify a critical vulnerability in federated learning systems where malicious 'dictator clients' can erase other participants' contributions while preserving their own, compromising the collaborative training process. The study provides theoretical and empirical analysis of single and multiple dictator scenarios, revealing fundamental security weaknesses in decentralized machine learning architectures.

AI · Bearish · arXiv – CS AI · Apr 13 · 7/10

XFED: Non-Collusive Model Poisoning Attack Against Byzantine-Robust Federated Classifiers

Researchers have developed XFED, a novel model poisoning attack that compromises federated learning systems without requiring attackers to communicate or coordinate with each other. The attack successfully bypasses eight state-of-the-art defenses, revealing fundamental security vulnerabilities in FL deployments that were previously underestimated.

AI · Bullish · arXiv – CS AI · Apr 7 · 7/10

CoopGuard: Stateful Cooperative Agents Safeguarding LLMs Against Evolving Multi-Round Attacks

Researchers have developed CoopGuard, a new defense framework that uses cooperative AI agents to protect Large Language Models from sophisticated multi-round adversarial attacks. The system employs three specialized agents coordinated by a central system that maintains defense state across interactions, achieving a 78.9% reduction in attack success rates compared to existing defenses.

AI · Bullish · arXiv – CS AI · 6d ago · 6/10

TITAN-FedAnil+: Trust-Based Adaptive Blockchain Federated Learning for Resource-Constrained Intelligent Enterprises

TITAN-FedAnil+ presents a blockchain-based federated learning framework designed to address data privacy and security challenges in resource-constrained enterprise environments. The system uses adaptive clustering and GPU acceleration to filter malicious updates while reducing memory overhead by up to 81%, making secure distributed learning more practical for edge devices.

AI · Neutral · arXiv – CS AI · 6d ago · 6/10

Token Rankings are Unforgeable Language Model Signatures

Researchers demonstrate that token ranking signatures from language model APIs are mathematically unforgeable—each model produces unique top-k token orderings that cannot be replicated by other models. While rankings leak less information than raw logits, they still enable approximate parameter theft, though APIs can mitigate this risk by restricting k to sufficiently small values.
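
To make the idea of a ranking-only interface concrete, here is a minimal Python sketch of what such an API might expose. It is an illustration only, not the paper's signature construction: the function name `top_k_ranking` and both logit vectors are hypothetical. The endpoint returns the order of the top-k tokens and drops the logit magnitudes, and a smaller `k` reveals fewer positions per query.

```python
def top_k_ranking(logits, k):
    """Return the ids of the k highest-logit tokens, best first.

    Hypothetical view of a ranking-only API: magnitudes are discarded.
    """
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    return order[:k]


# Made-up logits over a 5-token vocabulary for two different models.
logits_a = [2.0, 0.5, 1.5, -1.0, 0.1]
logits_b = [1.9, 1.6, 0.4, -1.0, 0.1]

ranking_a = top_k_ranking(logits_a, k=3)
ranking_b = top_k_ranking(logits_b, k=3)

# The two models agree on the top token but differ in the rest of the order.
print(ranking_a, ranking_b)
```

With `k=1` both toy models return the same single token, which gives a rough sense of why restricting `k` limits what each query gives away.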

AI · Neutral · arXiv – CS AI · Jun 2 · 6/10

SORA: Free Second-Order Attacks in Fast Adversarial Training

Researchers introduce SORA, a new adversarial training method that addresses catastrophic overfitting in fast neural network defense systems. By leveraging perturbation variability and a novel gradient alignment metric, SORA achieves state-of-the-art robustness against adversarial attacks while maintaining higher clean accuracy with improved computational efficiency.

AI · Neutral · arXiv – CS AI · Jun 2 · 6/10

GJDNet: Robust Graph Neural Networks via Joint Disentangled Learning Against Adversarial Attacks

Researchers propose GJDNet, a robust Graph Neural Network defense framework that protects against adversarial attacks by jointly disentangling node representations and decision spaces. The approach addresses vulnerabilities in GNNs caused by adversarial perturbations that invert graph connectivity patterns, achieving improved robustness across different graph types.

AI · Neutral · arXiv – CS AI · May 29 · 6/10

Quantum-Enhanced Adversarial Robustness in Artificial Intelligence

Researchers present a comprehensive framework exploring how quantum computing techniques can enhance artificial intelligence's resilience against adversarial attacks. The work addresses a critical vulnerability in modern AI systems—their susceptibility to carefully crafted perturbations—by proposing quantum-enhanced defense mechanisms through optimization, feature mapping, and hybrid architectures.

AI · Neutral · arXiv – CS AI · May 28 · 6/10

Mind the Gap: Mixtures of Gaussians in Approximate Differential Privacy

Researchers introduce mixture mechanisms for differential privacy that combine multiple Gaussian distributions to reduce noise in data queries while maintaining privacy guarantees. These mechanisms substantially outperform existing analytic Gaussian approaches in low-privacy regimes, approaching theoretical optimality with significantly lower noise amplitudes and variances.

AI · Neutral · arXiv – CS AI · May 27 · 6/10

Practical Anonymous Two-Party Gradient Boosting Decision Tree

Researchers introduce an anonymous gradient-boosted decision tree (GBDT) protocol enabling secure training on vertically partitioned data between two parties while hiding record identifiers. The approach uses dual circuit-PSI and oblivious pseudorandom functions to eliminate ID exposure risks inherent in standard private set intersection methods, while achieving computational efficiency comparable to non-private approaches.

AI · Neutral · arXiv – CS AI · May 27 · 6/10

Assessing Per-Sample Membership Inference Vulnerability without Retraining

Researchers propose a novel method to assess individual training data vulnerability to membership inference attacks without requiring shadow models. The approach combines theoretical analysis in linear settings with a practical surrogate score for deep networks, using only geometry and loss information from a single trained model.

AI · Bullish · arXiv – CS AI · May 12 · 6/10

A Robust Out-of-Distribution Detection Framework via Synergistic Smoothing

Researchers introduce ROSS, a robust out-of-distribution detection framework that combines median smoothing with instability quantification to defend machine learning systems against adversarial attacks. The method achieves state-of-the-art performance by leveraging the observation that OOD samples exhibit higher instability under perturbations, outperforming prior defenses by up to 40 AUROC points.
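
The pairing of median smoothing with an instability measure can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not the ROSS implementation: `smoothed_score`, the Gaussian noise model, the sample count, and the use of the interquartile range as the instability proxy are all hypothetical choices made here for clarity.

```python
import random
import statistics


def smoothed_score(score_fn, x, sigma=0.1, n_samples=101, seed=0):
    """Median-smooth a detector score and report how unstable it is.

    score_fn maps a list of floats to a scalar score (assumed interface).
    Returns (median score, interquartile range) over noisy copies of x.
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(n_samples):
        # Perturb every coordinate with independent Gaussian noise.
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        scores.append(score_fn(noisy))
    median = statistics.median(scores)
    q1, _, q3 = statistics.quantiles(scores, n=4)
    return median, q3 - q1


# Toy score functions: one barely reacts to noise, one reacts strongly.
flat = lambda v: 0.1 * sum(v)
steep = lambda v: 10.0 * sum(v)

x = [1.0, 2.0]
_, flat_instability = smoothed_score(flat, x)
_, steep_instability = smoothed_score(steep, x)
```

In this toy setup the steep score shows the larger spread under the same noise draws. That spread is the kind of signal the summary describes as higher for OOD samples.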

AI · Neutral · arXiv – CS AI · May 11 · 6/10

A Statistical Framework for Algorithmic Collective Action with Multiple Collectives

Researchers propose the first statistical framework for Algorithmic Collective Action (ACA) involving multiple independent collectives attempting to coordinate changes in shared data to influence AI classifier behavior. The framework provides computable bounds on collective success while accounting for varying sizes, strategies, and goal alignment across groups, with applications to climate adaptation in smart cities.

AI · Neutral · arXiv – CS AI · May 11 · 6/10

Towards Differentially Private Reinforcement Learning with General Function Approximation

Researchers present the first theoretical framework for differentially private reinforcement learning with general function approximation, achieving regret bounds of Õ(K^{3/5}) that match linear-case performance. This breakthrough extends privacy guarantees beyond tabular and linear settings, combining batched policy updates with the exponential mechanism for improved privacy-utility tradeoffs in online RL systems.

AI · Neutral · arXiv – CS AI · May 1 · 6/10

AdaBFL: Multi-Layer Defensive Adaptive Aggregation for Byzantine-Robust Federated Learning

Researchers propose AdaBFL, a Byzantine-robust federated learning method that uses adaptive multi-layer defense mechanisms to protect distributed machine learning systems from poisoning attacks by malicious clients. The approach balances defense against multiple attack types without requiring server-side dataset access, with proven convergence properties on non-IID data.
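
For readers new to Byzantine-robust aggregation, the sketch below shows a standard baseline, the coordinate-wise median, in Python. It is explicitly not AdaBFL's multi-layer method; the function name and the toy client updates are hypothetical. It only illustrates why a robust aggregator resists a poisoned update better than a plain average does.

```python
import statistics


def coordinate_median(updates):
    """Aggregate client updates by taking the median of each coordinate."""
    return [statistics.median(coord) for coord in zip(*updates)]


def coordinate_mean(updates):
    """Plain averaging, shown for comparison (no robustness)."""
    return [statistics.fmean(coord) for coord in zip(*updates)]


# Three honest clients send similar updates; one malicious client sends an outlier.
honest = [[0.9, -1.1], [1.0, -1.0], [1.1, -0.9]]
poisoned = [[100.0, 100.0]]

robust = coordinate_median(honest + poisoned)
naive = coordinate_mean(honest + poisoned)
print(robust, naive)
```

With a single outlier among four clients, the median stays close to the honest updates while the mean is pulled far away.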

AI · Bullish · arXiv – CS AI · Apr 14 · 6/10

QShield: Securing Neural Networks Against Adversarial Attacks using Quantum Circuits

Researchers introduce QShield, a hybrid quantum-classical neural network architecture that combines traditional CNNs with quantum processing modules to defend deep learning models against adversarial attacks. Testing on MNIST, OrganAMNIST, and CIFAR-10 datasets shows the hybrid approach maintains accuracy while substantially reducing attack success rates and increasing computational costs for adversaries.

AI · Neutral · arXiv – CS AI · Apr 13 · 6/10

CLIP-Inspector: Model-Level Backdoor Detection for Prompt-Tuned CLIP via OOD Trigger Inversion

Researchers introduce CLIP-Inspector, a backdoor detection method for prompt-tuned CLIP models that reconstructs hidden triggers using out-of-distribution images to identify if a model has been maliciously compromised. The technique achieves 94% detection accuracy and enables post-hoc model repair, addressing critical security vulnerabilities in outsourced machine learning services.