#ai-security News & Analysis

Recent coverage of #ai-security leans skeptical, with nearly half of the articles from the past month taking a bearish stance. The 250 indexed articles reflect sustained concern about vulnerabilities and risks as artificial intelligence systems become more prevalent. Anthropic and its Claude model dominate discussions alongside emerging systems like GPT-5, while research from arXiv – CS AI forms the bulk of technical analysis. Sentiment has held relatively stable over the past 90 days, suggesting that these security concerns are ongoing rather than newly emerged. Coverage frequently intersects with #cybersecurity, #machine-learning, #ai-safety, and #adversarial-attacks, indicating that security issues span multiple technical domains. Browse the articles below for the specific threats and defensive approaches currently under scrutiny.

Sentiment · last 30 days (86 articles)

Top sources: arXiv – CS AI (147) · Crypto Briefing (10) · Blockonomi (8) · Fortune Crypto (7) · The Register – AI (7)

Often co-tagged with: #cybersecurity #machine-learning #ai-safety #adversarial-attacks #anthropic #prompt-injection

Most-discussed entities: Anthropic (19) · Claude (8) · GPT-5 (7) · OpenAI (6) · Llama (4)

330 articles

🤖 AI × Crypto · Bearish · CoinTelegraph · Apr 15 · 🔥 8/10

North Korean hackers used AI-enabled social engineering in Zerion attack

North Korean hackers executed a sophisticated attack on Zerion using AI-enabled social engineering tactics, marking the second major long-term social engineering campaign this month following the $280 million Drift Protocol exploit. The incident demonstrates how threat actors are leveraging artificial intelligence to enhance the effectiveness and scale of credential compromise attacks against cryptocurrency platforms.

🤖 AI × Crypto · Bearish · CoinDesk · Apr 13 · 7/10

AI agents are set to power crypto payments, but a hidden flaw could expose wallets

Researchers have identified a critical vulnerability in AI infrastructure layers used for cryptocurrency payments, where intermediary systems can intercept sensitive wallet data. The flaw has reportedly enabled credential theft and at least one $500,000 wallet drain, exposing a significant security gap as AI agents become more integrated into crypto transaction systems.

🧠 AI · Bearish · Fortune Crypto · Apr 10 · 🔥 8/10

The AI that found 27-year-old vulnerabilities no human ever caught before just forced an emergency meeting with every major Wall Street CEO

Anthropic's latest AI model discovered 27-year-old security vulnerabilities that human researchers missed, prompting Treasury Secretary Scott Bessent and Fed Chair Jerome Powell to convene an emergency meeting with major Wall Street CEOs. The incident highlights critical gaps in legacy system security and raises questions about AI's expanding role in identifying financial infrastructure risks.

🏢 Anthropic

🧠 AI · Bearish · CoinDesk · Apr 10 · 7/10

Mythos AI threat prompts Bessent, Powell to convene bank CEOs for urgent talks

Treasury Secretary Bessent and Federal Reserve Chair Powell are convening bank CEOs for urgent discussions following concerns about Mythos, an AI system capable of rapidly identifying software vulnerabilities and developing sophisticated exploits. The meeting addresses fears that such AI capabilities could pose systemic risks to financial institutions and banking infrastructure.

🧠 AI · Bearish · Daily Hodl · 1d ago · 7/10

Pennsylvania Bank Issues Urgent Alert After AI Application Triggers Data Breach, Exposing Sensitive Customer Info

Community Bank, a Pennsylvania-based financial institution, disclosed a data breach caused by an AI application that exposed customer names, Social Security numbers, and dates of birth. The breach, reported to the SEC, highlights emerging cybersecurity vulnerabilities in AI-powered banking systems and raises concerns about enterprise AI security practices across the financial sector.

🧠 AI · Bearish · Decrypt – AI · 1d ago · 7/10

What Is an AI Prompt Injection Attack? The Hidden Threat Hijacking Your Chatbots

Prompt injection attacks allow hackers to manipulate AI chatbots like ChatGPT, Claude, and Gemini through adversarial text inputs, potentially hijacking their behavior and outputs. OpenAI has indicated this vulnerability may be inherent to large language models and difficult to fully eliminate, raising significant security concerns for enterprises and individual users relying on these systems.

🏢 OpenAI · 🧠 ChatGPT · 🧠 Claude

🤖 AI × Crypto · Bearish · Fortune Crypto · 2d ago · 7/10

The AI arms race in cybersecurity has started. Most companies aren’t ready

An emerging AI arms race in cybersecurity has begun, with threat actors leveraging artificial intelligence for sophisticated attacks while most organizations lack adequate defensive measures. Coinbase's security leadership highlights the urgency for companies to adopt AI-powered security strategies to counter evolving threats.

🧠 AI · Bearish · arXiv – CS AI · 2d ago · 7/10

Audio Jailbreaks in Large Audio-Language Models: Taxonomy, Attack-Defense Analysis, and Cost-Aware Evaluation

Researchers have developed a comprehensive taxonomy of jailbreak attacks and defenses for Large Audio Language Models (LALMs), identifying vulnerabilities across semantic, acoustic, signal, and embedding layers. The study reveals that current defenses create tradeoffs between robustness and usability, highlighting the need for cost-aware safety evaluation beyond simple success-rate metrics.

🧠 AI · Bearish · arXiv – CS AI · 2d ago · 7/10

Token Inflation: How Dishonest Providers Can Overcharge for Large Language Model Usage

Researchers demonstrate that LLM providers can systematically inflate the token counts billed to users, with hidden reasoning tokens inflatable by up to 1,469% without detection. The core issue is an audit paradox: providers control both the tokenizer and execution, so users cannot verify their bills without independent mechanisms such as trusted execution attestation or cryptographic proofs.

🧠 AI · Bullish · arXiv – CS AI · 2d ago · 7/10

AgentDoG 1.5: A Lightweight and Scalable Alignment Framework for AI Agent Safety and Security

Researchers introduce AgentDoG 1.5, a lightweight AI safety framework designed to protect open-world agents like OpenClaw from emerging security risks. The framework uses only ~1k training samples to create efficient models (0.8B-8B parameters) that match closed-source alternatives while reducing deployment overhead by 100x, with all resources released openly.

🧠 GPT-5

🧠 AI · Bearish · arXiv – CS AI · 2d ago · 7/10

Finding DoRI: Discovery of Retained Images in Diffusion Models

Researchers challenge the assumption that memorization in text-to-image diffusion models can be localized to specific weights, demonstrating that pruning efforts can be bypassed through minor text embedding perturbations. The study reveals that memorization is distributed throughout embedding space, suggesting that current mitigation strategies are fundamentally fragile and that new approaches are needed to protect training data privacy.

🧠 AI · Bearish · arXiv – CS AI · 2d ago · 7/10

Evaluating Dataset Watermarking for Fine-tuning Traceability of Customized Diffusion Models: A Comprehensive Benchmark and Removal Approach

Researchers have established the first comprehensive evaluation framework for dataset watermarking in fine-tuned diffusion models, revealing significant vulnerabilities in existing protection methods. While current watermarking techniques show promise in universality and transmissibility, the study demonstrates practical watermark removal methods that can eliminate these protections without degrading model performance, exposing critical gaps in copyright and security safeguards.

🧠 AI · Bullish · arXiv – CS AI · 2d ago · 7/10

KYA: A Framework-Agnostic Trust Layer for Autonomous Systems with Verifiable Provenance and Hierarchical Policy Composition

KYA (Know Your Agents) is an open-source trust and governance framework for autonomous systems that enables verifiable authorization, policy compliance, and post-hoc auditability across multi-agent environments. The system demonstrates strong security performance, detecting 89% of adversarial attacks while maintaining sub-millisecond latency and supporting 15+ agent frameworks.

🤖 AI × Crypto · Neutral · arXiv – CS AI · 2d ago · 7/10

Agora: Toward Autonomous Bug Detection in Production-Level Consensus Protocols with LLM Agents

Researchers introduced Agora, a multi-agent LLM framework designed to detect deep logic bugs in consensus protocols used by blockchains and distributed systems. The system discovered 15 previously unknown protocol-level bugs in major implementations (Raft, EPaxos, HotStuff, BullShark) that existing LLM approaches failed to identify, demonstrating the effectiveness of domain-aware collaborative AI for protocol verification.

🧠 AI · Bearish · Ars Technica – AI · 2d ago · 7/10

Fed up with vibe coders, dev sneaks data-nuking prompt injection into their code

A developer embedded a prompt injection attack into the jqwik library that instructed AI coding agents to delete application output, highlighting vulnerabilities in AI-assisted development tools. The incident reveals how malicious actors can compromise open-source projects to target AI systems, creating risks for developers relying on autonomous coding agents.

🧠 AI · Bullish · arXiv – CS AI · 3d ago · 7/10

SPARD: Defending Harmful Fine-Tuning Attack via Safety Projection with Relevance-Diversity Data Selection

Researchers propose SPARD, a defense framework that protects large language models from harmful fine-tuning attacks by combining safety-constrained optimization with relevance- and diversity-based data selection. The method maintains task performance while significantly reducing the effectiveness of attacks that attempt to remove safety guardrails from AI systems.

🧠 AI · Neutral · arXiv – CS AI · 3d ago · 7/10

I Hear, Therefore I Trust: A Socio-Technical Investigation of Humans as Synthetic Speech Detectors

Researchers conducted a study with 47 participants to evaluate how humans detect synthetic speech, testing detection accuracy across authentic, fully synthetic, and partially synthetic utterances under various trust manipulation conditions. The findings reveal that humans perform poorly at detecting fully synthetic speech (below-chance levels) and that trust cues like instructional framing and provenance labeling do not significantly improve detection, though they influence detection behavior.