AI · Neutral · arXiv – CS AI · 7h ago · 6/10
Truth, Trust, and Trouble: Medical AI on the Edge
Researchers benchmarked three open-source LLMs on medical question answering: AlpaCare-13B, BioMistral-7B-DARE, and Mistral-7B, evaluated on accuracy, safety, and helpfulness. The results reveal fundamental trade-offs between factual reliability and harm prevention in medical AI systems, with implications for deploying these models in clinical settings.