arXiv – CS AI
Physics-Based Benchmarking Metrics for Multimodal Synthetic Images
Researchers propose PCMDE, an evaluation metric for synthetic multimodal images that combines large language models with vision-language models and physics-based reasoning. The authors present it as a better measure of semantic and structural accuracy than existing measures such as BLIP and CLIPScore. Its three-stage approach targets a limitation of current metrics: they capture domain-specific and context-dependent image quality poorly.