#patient-safety News & Analysis

10 articles tagged with #patient-safety. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Neutral · arXiv – CS AI · May 7 · 7/10

Evaluating Patient Safety Risks in Generative AI: Development and Validation of a FMECA Framework for Generated Clinical Content

Researchers developed and validated the first FMECA (Failure Mode, Effects, and Criticality Analysis) framework to systematically assess patient safety risks in clinical summaries generated by large language models. Testing with GPT-OSS 120B on real hospital discharge summaries demonstrated moderate-to-substantial inter-rater agreement and identified 14 distinct failure modes, establishing a reproducible methodology for evaluating the safety of AI-generated clinical content.

AI · Bullish · arXiv – CS AI · May 1 · 7/10

CareGuardAI: Context-Aware Multi-Agent Guardrails for Clinical Safety & Hallucination Mitigation in Patient-Facing LLMs

CareGuardAI is a safety framework designed to mitigate clinical risks and hallucinations in patient-facing medical LLMs through dual risk assessment mechanisms. The system employs context-aware multi-agent guardrails that evaluate both clinical safety and factual reliability before releasing responses, outperforming GPT-4o-mini on specialized healthcare benchmarks.

AI · Bearish · arXiv – CS AI · Apr 6 · 7/10

When AI Gets it Wrong: Reliability and Risk in AI-Assisted Medication Decision Systems

A research paper examines reliability issues in AI-assisted medication decision systems, finding that even systems with good aggregate performance can produce dangerous errors in real-world healthcare scenarios. The study emphasizes that single incorrect AI recommendations in medication management can cause severe patient harm, highlighting the need for human oversight and risk-aware evaluation approaches.

AI · Neutral · arXiv – CS AI · 4d ago · 6/10

Towards Error-Free EHRs: Reasoning-Intensive Consistency Verification Between Clinical Notes and Structured Tables in Electronic Health Records

Researchers introduce EHR-ReasonCon, a benchmark dataset, and EHR-Inspector, an LLM-based framework that verifies consistency between unstructured clinical notes and structured data in Electronic Health Records. The work addresses a critical gap in healthcare data quality by moving beyond simple value matching to capture clinical reasoning, temporal relationships, and event interpretations that reflect real-world documentation practices.

AI · Bullish · arXiv – CS AI · May 11 · 6/10

Automated Evaluation can Distinguish the Good and Bad AI Responses to Patient Questions about Hospitalization

Researchers demonstrate that automated evaluation metrics can reliably assess AI-generated responses to patient hospitalization questions, matching human expert ratings across 2,800 responses from 28 AI systems. This approach addresses the scalability limitations of manual expert review while maintaining accuracy across three key dimensions: question answering, clinical evidence use, and medical knowledge application.

AI · Neutral · arXiv – CS AI · May 9 · 6/10

Systematic Evaluation of Large Language Models for Post-Discharge Clinical Action Extraction

Researchers systematically evaluated large language models against supervised BERT models for extracting post-discharge clinical actions from narrative hospital notes. LLMs matched or exceeded supervised baselines on binary actionability detection but lagged on fine-grained multi-label classification, revealing that performance gaps stem from misalignment between model reasoning and annotation conventions rather than pure capability limitations.

AI · Bearish · crypto.news · Apr 11 · 6/10

AI Therapy Chatbots Face Growing State Bans as Maine Advances Bill and Missouri Follows

Maine and Missouri are advancing legislative bans on AI therapy chatbots, reflecting growing state-level regulatory skepticism toward AI-driven mental health services. This trend signals potential restrictions on a developing sector, though the movement remains fragmented across individual states without federal coordination.

AI · Bearish · The Register – AI · Mar 4 · 7/10

AI doctor's assistant is easily swayed to change prescriptions, give bad medical advice

Research reveals that AI-powered medical assistant systems can be easily manipulated to change prescriptions and provide harmful medical advice. The study highlights significant vulnerabilities in AI healthcare tools that could pose serious risks to patient safety.

AI · Bearish · arXiv – CS AI · Mar 2 · 7/10 · 19

Beyond Accuracy: Risk-Sensitive Evaluation of Hallucinated Medical Advice

Researchers propose a new risk-sensitive framework for evaluating AI hallucinations in medical advice that considers potential harm rather than just factual accuracy. The study reveals that AI models with similar performance show vastly different risk profiles when generating medical recommendations, highlighting critical safety gaps in current evaluation methods.

AI · Bearish · arXiv – CS AI · Feb 27 · 6/10 · 7

ClinDet-Bench: Beyond Abstention, Evaluating Judgment Determinability of LLMs in Clinical Decision-Making

Researchers developed ClinDet-Bench, a new benchmark that reveals large language models fail to properly identify when they have sufficient information to make clinical decisions. The study shows LLMs make both premature judgments and excessive abstentions in medical scenarios, highlighting safety concerns for AI deployment in healthcare settings.