AI · Bearish · arXiv – CS AI · 14h ago · 7/10
VeriSim: A Configurable Framework for Evaluating Medical AI Under Realistic Patient Noise
Researchers introduce VeriSim, an open-source framework that tests medical AI systems by injecting realistic patient communication barriers, such as memory gaps and health literacy limitations, into clinical simulations. Testing across seven LLMs reveals significant performance degradation (15-25% accuracy drop), with smaller models suffering 40% greater decline than larger ones, exposing a critical gap between standardized benchmarks and real-world clinical robustness.