y0news
AI · Neutral · arXiv – CS AI · 6h ago

Ref-Adv: Exploring MLLM Visual Reasoning in Referring Expression Tasks

Researchers introduce Ref-Adv, a new benchmark for testing multimodal large language models' visual reasoning capabilities in referring expression tasks. The benchmark reveals that current MLLMs, despite performing well on standard datasets like RefCOCO, rely heavily on shortcuts and show significant gaps in genuine visual reasoning and grounding abilities.