AI · Bearish · arXiv – CS AI · 7h ago · 6/10
MentisOculi: Revealing the Limits of Reasoning with Mental Imagery
Researchers developed MentisOculi, a benchmark suite to test whether frontier multimodal AI models can use visual reasoning and mental imagery to solve complex problems. Testing shows that visual strategies—from latent tokens to generated images—fail to improve performance, revealing that despite their theoretical appeal, current models cannot effectively leverage visual thoughts for reasoning.