AI · Neutral · arXiv – CS AI · 8h ago · 7/10
DrugBench: Evaluating AI Control Protocols for Medication Harm Mitigation
Researchers introduce DrugBench, a benchmark for evaluating AI safety protocols in medical LLM applications, combining 3,671 medical conversations with FDA drug data to test systems against medication-related harms. The study reveals that existing AI control mechanisms can be circumvented and proposes severity-based monitoring to better account for the potential consequences of unsafe outputs in clinical contexts.