#risk-assessment News & Analysis

59 articles tagged with #risk-assessment. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.


AI · Neutral · arXiv – CS AI · Mar 11 · 7/10

🧠

OOD-MMSafe: Advancing MLLM Safety from Harmful Intent to Hidden Consequences

Researchers introduce OOD-MMSafe, a new benchmark revealing that current Multimodal Large Language Models fail to identify hidden safety risks up to 67.5% of the time. They also present the CASPO framework, which reduces failure rates to under 8% for risk identification in consequence-driven safety scenarios.

AI · Bearish · TechCrunch – AI · Mar 6 · 7/10

🧠

Anthropic to challenge DOD’s supply chain label in court

Anthropic CEO Dario Amodei announced plans to legally challenge the Department of Defense's designation of the AI company as a supply chain risk. The CEO stated that most of Anthropic's customers remain unaffected by this regulatory label.

🏢 Anthropic

AI · Neutral · arXiv – CS AI · Mar 5 · 7/10

🧠

Goal-Driven Risk Assessment for LLM-Powered Systems: A Healthcare Case Study

Researchers propose a new goal-driven risk assessment framework for LLM-powered systems, specifically targeting healthcare applications. The approach uses attack trees to identify detailed threat vectors combining adversarial AI attacks with conventional cyber threats, addressing security gaps in LLM system design.

AI × Crypto · Bearish · CryptoPotato · Mar 2 · 🔥 8/10 · 9

🤖

World War III Scenario: Which Crypto Would Suffer the Most? (4 AIs Respond)

Four AI models analyzed a hypothetical World War III scenario to identify which cryptocurrencies would be most vulnerable to massive price declines. The analysis suggests certain tokens could potentially plummet by 90% in such extreme geopolitical conditions.

AI · Neutral · arXiv – CS AI · Feb 27 · 7/10 · 5

🧠

LLM Novice Uplift on Dual-Use, In Silico Biology Tasks

A research study found that novice users with access to large language models were 4.16 times more accurate on biosecurity-relevant tasks than those using only internet resources. The study raises concerns about dual-use risks, as 89.6% of participants reported easily obtaining potentially dangerous biological information despite AI safeguards.

AI · Neutral · Google DeepMind Blog · Apr 2 · 7/10 · 6

🧠

Taking a responsible path to AGI

The article discusses the development of Artificial General Intelligence (AGI) with an emphasis on responsible development practices. The focus is on technical safety, proactive risk assessment, and collaborative approaches within the AI community.

AI · Neutral · Hugging Face Blog · May 24 · 7/10 · 7

🧠

CyberSecEval 2 - A Comprehensive Evaluation Framework for Cybersecurity Risks and Capabilities of Large Language Models

CyberSecEval 2 is a comprehensive evaluation framework designed to assess cybersecurity risks and capabilities of Large Language Models. The framework aims to provide standardized metrics for evaluating AI model security vulnerabilities and defensive capabilities in cybersecurity contexts.

AI · Neutral · OpenAI News · Jan 31 · 7/10 · 3

🧠

Building an early warning system for LLM-aided biological threat creation

Researchers developed a framework to assess whether large language models could help create biological threats, testing GPT-4 with biology experts and students. The study found GPT-4 provides only mild assistance in biological threat creation, though results aren't conclusive and require further research.

AI · Neutral · Crypto Briefing · Jun 25 · 6/10

🧠

Anthropic hires Stanford economist Chad Jones to assess AI risks

Anthropic has hired Stanford economist Chad Jones to develop economic frameworks for assessing AI risks and opportunities. This move reflects the AI safety industry's growing recognition that rigorous economic analysis is essential for understanding and mitigating existential risks posed by advanced artificial intelligence systems.

🏢 Anthropic

General · Bearish · Crypto Briefing · Jun 25 · 6/10

📰

SpaceX bond sale signals bubble territory, warns Allianz CIO

Allianz's Chief Investment Officer has warned that SpaceX's recent bond sale signals excessive valuations in private markets, particularly for high-growth technology companies. The warning reflects broader concerns about inflated asset prices and suggests investors should recalibrate their exposure to speculative growth assets.

AI · Neutral · arXiv – CS AI · Jun 25 · 6/10

🧠

SciRisk-Bench: A Risk-Dimension-Aware Benchmark for AI4Science Safety

Researchers introduce SciRisk-Bench, a comprehensive safety benchmark for evaluating AI language models in scientific applications across 7 disciplines and 10 risk dimensions. The benchmark addresses growing concerns about LLM safety in high-stakes scientific contexts where errors could have serious consequences.

Crypto · Bearish · Decrypt · Jun 18 · 6/10

⛓️

Ireland Tightens Crypto Safeguards in New Financial Crime Action Plan

Ireland has released a new National Risk Assessment identifying crypto-asset misuse as a top financial crime threat and implemented a 30-point action plan to strengthen regulatory oversight of cryptocurrency funds. This regulatory tightening reflects growing government focus on preventing illicit use of digital assets.

AI × Crypto · Neutral · Blockonomi · Jun 5 · 6/10

🤖

Free AI Trading Bot Tools in 2026: What Beginners Should Know Before Testing Automation

The article examines the proliferation of free AI trading bot tools in 2026, emphasizing that free access through demos, paper trading, and limited trials should not be mistaken for trustworthiness. Beginners are cautioned to view these offerings as testing mechanisms rather than endorsements of platform reliability or trading performance.

AI · Neutral · arXiv – CS AI · Jun 5 · 6/10

🧠

Risk Assessment of Autonomous Driving: Integrating Technical Failures, Ethical Dilemmas, and Policy Frameworks

Researchers analyzing autonomous vehicle safety data from NHTSA, California DMV, and MIT datasets identify perception and classification errors as primary technical failure modes, while highlighting divergent ethical frameworks and inconsistent regulatory approaches across jurisdictions as critical barriers to safe, widespread deployment.

Crypto · Neutral · CoinDesk · Jun 4 · 6/10

⛓️

Crypto for Advisors: The crypto due diligence questions you forgot to ask

As stablecoins mature, regulatory frameworks evolve, and AI-driven infrastructure advances, financial advisors must update their cryptocurrency due diligence processes to address gaps in their current assessment frameworks. The article highlights three critical questions advisors should reconsider to ensure comprehensive crypto risk evaluation.

AI · Neutral · arXiv – CS AI · Jun 2 · 6/10

🧠

Predicting the risk of colorectal anastomotic leak based on preoperative mapping of the blood supply of the bowel

Researchers have developed a protocol for an AI-driven system that uses CT imaging to predict the risk of anastomotic leak, a serious complication in colorectal cancer surgery. The framework integrates deep learning analysis of vascular features with a case-retrieval tool to support surgical decision-making, offering a reproducible methodology for hospitals and universities to implement precision surgery tools.

AI · Neutral · arXiv – CS AI · Jun 2 · 6/10

🧠

SeClaw: Spec-Driven Security Task Synthesis for Evaluating Autonomous Agents

Researchers introduce SeClaw, a framework for systematically evaluating security vulnerabilities in autonomous LLM agents through specification-driven task synthesis and execution-based testing. The tool addresses gaps in current agent security benchmarks by providing scalable, reproducible assessment of unsafe behaviors across diverse risk scenarios.

AI · Neutral · arXiv – CS AI · May 12 · 6/10

🧠

A Semantic-Sampling Framework for Evaluating Calibration in Open-Ended Question Answering

Researchers introduce Sem-ECE, a new framework for evaluating how well large language models calibrate their confidence in open-ended question answering tasks. The method samples multiple answers from LLMs, groups them semantically, and uses answer frequency distributions as confidence measures, outperforming existing evaluation approaches across major commercial models.

DeFi · Bearish · Bitcoinist · May 2 · 6/10

💎

XRP Analyst Breaks Down Your Earnings If Deposited For Yield

Crypto analyst Iso Ledger has issued a cautionary assessment of earnXRP, a yield product associated with Upshift and the Flare Network, urging XRP holders to carefully evaluate the offering before depositing funds. The warning contrasts with promotional narratives about passive income opportunities, highlighting the importance of due diligence in emerging DeFi yield products.

$XRP

Crypto · Neutral · U.Today · Apr 20 · 6/10

⛓️

9/10 Shiba Inu (SHIB) Indicators Are in Green, But There's a Catch

Shiba Inu is displaying bullish technical indicators across 90% of tracked metrics, but analysts warn this activity surge may reflect unhealthy market dynamics rather than genuine fundamentals. The contradiction between positive signals and underlying concerns suggests investors should exercise caution despite apparent technical strength.

General · Neutral · Crypto Briefing · Apr 17 · 6/10

📰

Israel lifts wartime restrictions, Independence Day ceremonies proceed

Israel has lifted wartime restrictions and is proceeding with Independence Day ceremonies, signaling a cautious shift toward regional stability despite ongoing tensions at its northern border. The move reflects efforts to normalize civilian life while security concerns remain elevated.

AI · Neutral · arXiv – CS AI · Mar 3 · 6/10 · 3

🧠

LLMs as Strategic Actors: Behavioral Alignment, Risk Calibration, and Argumentation Framing in Geopolitical Simulations

A research study evaluated six state-of-the-art large language models in geopolitical crisis simulations, comparing their decision-making to human behavior. The study found that LLMs initially mirror human decisions but diverge over time, consistently exhibiting cooperative, stability-focused strategies with limited adversarial reasoning.

AI · Neutral · arXiv – CS AI · Mar 2 · 7/10 · 12

🧠

An Agentic LLM Framework for Adverse Media Screening in AML Compliance

Researchers have developed an agentic LLM framework using Retrieval-Augmented Generation to automate adverse media screening for anti-money laundering compliance in financial institutions. The system addresses high false-positive rates in traditional keyword-based approaches by implementing multi-step web searches and computing Adverse Media Index scores to distinguish between high-risk and low-risk individuals.

AI · Bearish · arXiv – CS AI · Mar 2 · 7/10 · 14

🧠

ForesightSafety Bench: A Frontier Risk Evaluation and Governance Framework towards Safe AI

Researchers have developed ForesightSafety Bench, a comprehensive AI safety evaluation framework covering 94 risk dimensions across 7 fundamental safety pillars. The benchmark evaluation of over 20 advanced large language models revealed widespread safety vulnerabilities, particularly in autonomous AI agents, AI4Science, and catastrophic risk scenarios.

AI · Bearish · arXiv – CS AI · Mar 2 · 7/10 · 19

🧠

Beyond Accuracy: Risk-Sensitive Evaluation of Hallucinated Medical Advice

Researchers propose a new risk-sensitive framework for evaluating AI hallucinations in medical advice that considers potential harm rather than just factual accuracy. The study reveals that AI models with similar performance show vastly different risk profiles when generating medical recommendations, highlighting critical safety gaps in current evaluation methods.
