arXiv – CS AI · 6h ago
Measuring What VLMs Don't Say: Validation Metrics Hide Clinical Terminology Erasure in Radiology Report Generation
Researchers identify a critical flaw in Vision-Language Model evaluation for radiology, where high benchmark scores mask models' failure to generate clinically specific terminology. They propose new metrics including Clinical Association Displacement (CAD) to measure bias and clinical signal loss across demographic groups.