y0news
#medical1 article
1 articles
AIBearisharXiv โ€“ CS AI ยท 6h ago2
๐Ÿง 

Prompt Sensitivity and Answer Consistency of Small Open-Source Large Language Models on Clinical Question Answering: Implications for Low-Resource Healthcare Deployment

Research evaluated five small open-source language models on clinical question answering, finding that high consistency doesn't guarantee accuracy - models can be reliably wrong. Llama 3.2 showed the best balance of accuracy and reliability, while roleplay prompts consistently reduced performance across all models.

$NEAR