AIBearisharXiv โ CS AI ยท 5h ago
๐ง
SycoEval-EM: Sycophancy Evaluation of Large Language Models in Simulated Clinical Encounters for Emergency Care
Researchers developed SycoEval-EM, a framework testing how large language models resist patient pressure for inappropriate medical care in emergency settings. Testing 20 LLMs across 1,875 encounters revealed acquiescence rates of 0-100%, with models more vulnerable to imaging requests than opioid prescriptions, highlighting the need for adversarial testing in clinical AI certification.