10 articles tagged with #ai-vulnerabilities. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.
AIBearisharXiv – CS AI · Mar 277/10
🧠Research reveals that LLM system prompt configuration creates massive security vulnerabilities, with the same model's phishing detection rates ranging from 1% to 97% based solely on prompt design. The study PhishNChips demonstrates that more specific prompts can paradoxically weaken AI security by replacing robust multi-signal reasoning with exploitable single-signal dependencies.
AINeutralOpenAI News · Mar 257/10
🧠OpenAI has launched a Safety Bug Bounty program designed to identify and address AI safety risks and potential abuse vectors. The program specifically targets vulnerabilities including agentic risks, prompt injection attacks, and data exfiltration threats.
🏢 OpenAI
AIBearisharXiv – CS AI · Mar 167/10
🧠Researchers have released MalURLBench, the first benchmark to evaluate how LLM-based web agents handle malicious URLs, revealing significant vulnerabilities across 12 popular models. The study found that existing AI agents struggle to detect disguised malicious URLs and proposed URLGuard as a defensive solution.
AIBearisharXiv – CS AI · Mar 117/10
🧠Researchers developed NetDiffuser, a framework that uses diffusion models to generate natural adversarial examples capable of deceiving AI-based network intrusion detection systems. The system achieved up to 29.93% higher attack success rates compared to baseline attacks, highlighting significant vulnerabilities in current deep learning-based security systems.
AIBearisharXiv – CS AI · Mar 37/103
🧠Researchers have developed a new 'untargeted jailbreak attack' (UJA) that can compromise AI safety systems in large language models with over 80% success rate using only 100 optimization iterations. This gradient-based attack method expands the search space by maximizing unsafety probability without fixed target responses, outperforming existing attacks by over 30%.
AIBearishThe Register – AI · Mar 256/10
🧠The article title suggests a new type of AI supply chain attack that doesn't require traditional malware, instead using poisoned documentation as the attack vector. However, no article body content was provided for analysis.
AIBearishThe Register – AI · Mar 47/10
🧠Research reveals that AI-powered medical assistant systems can be easily manipulated to change prescriptions and provide harmful medical advice. The study highlights significant vulnerabilities in AI healthcare tools that could pose serious risks to patient safety.
AIBearisharXiv – CS AI · Mar 37/107
🧠Researchers have developed CaptionFool, a universal adversarial attack that can manipulate AI image captioning models by modifying just 1.2% of image patches. The attack achieves 94-96% success rates in forcing models to generate arbitrary captions, including offensive content that can bypass content moderation systems.
AIBearishIEEE Spectrum – AI · Jan 216/105
🧠Large language models (LLMs) remain highly vulnerable to prompt injection attacks where specific phrasing can override safety guardrails, causing AI systems to perform forbidden actions or reveal sensitive information. Unlike humans who use contextual judgment and layered defenses, current LLMs lack the ability to assess situational appropriateness and cannot universally prevent such attacks.
AIBearishOpenAI News · Feb 246/105
🧠Adversarial examples are specially crafted inputs designed to fool machine learning models into making incorrect predictions, functioning like optical illusions for AI systems. The article explores how these attacks work across different mediums and highlights the challenges in defending ML systems against such vulnerabilities.