AI · Neutral · arXiv – CS AI · 9h ago · 6/10
Almieyar-Oryx-BloomBench: A Bilingual Multimodal Benchmark for Cognitively Informed Evaluation of Vision-Language Models
Researchers introduce BloomBench, a bilingual English-Arabic benchmark grounded in Bloom's Taxonomy to rigorously evaluate Vision-Language Models across six cognitive levels. The study reveals that state-of-the-art VLMs excel at semantic understanding but struggle with factual recall and creative synthesis, while exposing significant performance gaps between Arabic and English reasoning tasks.