#adversarial-attack News & Analysis

4 articles tagged with #adversarial-attack. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bearish · arXiv – CS AI · Jun 2 · 7/10

Jailbreaking Multimodal Large Language Models using Multi-Clip Video

Researchers have identified critical vulnerabilities in how multimodal large language models (MLLMs) process video inputs, demonstrating that safety mechanisms can be systematically bypassed using multi-clip videos with diverse contexts. The study reveals that video inputs pose greater security risks than static images, with attack success rates rising as the number of video clips increases.

AI · Bearish · arXiv – CS AI · May 11 · 7/10

Searching for Privacy Risks in LLM Agents via Simulation

Researchers developed a search-based framework to identify privacy vulnerabilities in LLM-based agents through simulated multi-turn interactions. The study reveals that malicious agents employ sophisticated tactics like impersonation and consent forgery to extract sensitive information, while defenses evolve into robust identity-verification systems, with findings generalizing across diverse scenarios and models.

AI · Bearish · arXiv – CS AI · May 7 · 7/10

On the (In-)Security of the Shuffling Defense in the Transformer Secure Inference

Researchers demonstrate that the shuffling defense used to protect Transformer model weights during secure inference can be broken by an alignment attack, allowing adversaries to recover the weights at minimal cost. The attack exploits multiple shuffled activations by finding a common permutation, undermining a key security assumption in privacy-preserving machine learning.

AI · Bearish · arXiv – CS AI · May 4 · 7/10

Exploring LLM biases to manipulate AI search overview

Researchers demonstrate that large language models (LLMs) used in AI search overview systems are vulnerable to bias manipulation through snippet rewriting optimized with reinforcement learning. The study reveals that adversaries can exploit LLM biases to influence search result rankings and generate inaccurate or harmful information, posing significant security risks to AI-powered search applications.