AI · Neutral · arXiv – CS AI · 7h ago · 6/10
V-REX: Benchmarking Exploratory Visual Reasoning via Chain-of-Questions
Researchers introduce V-REX, an evaluation benchmark for vision-language models (VLMs) that assesses their ability to perform complex, multi-step visual reasoning through a Chain-of-Questions (CoQ) methodology. The framework disentangles VLMs' planning and information-gathering capabilities, revealing significant performance gaps and substantial room for improvement in exploratory visual reasoning tasks.