AINeutralarXiv โ CS AI ยท 5d ago6/103
๐ง
OmniSpatial: Towards Comprehensive Spatial Reasoning Benchmark for Vision Language Models
Researchers introduce OmniSpatial, a comprehensive benchmark for testing spatial reasoning capabilities in vision-language models (VLMs). The benchmark reveals significant limitations in both open and closed-source VLMs across four major spatial reasoning categories, with over 8,400 question-answer pairs testing advanced cognitive abilities.
$NEAR