arXiv CS AI · 4d ago
MMR-Life: Piecing Together Real-life Scenes for Multimodal Multi-image Reasoning
Researchers introduced MMR-Life, a comprehensive benchmark with 2,646 questions and 19,108 real-world images to evaluate multimodal reasoning capabilities of AI models. Even top models like GPT-5 achieved only 58% accuracy, highlighting significant challenges in real-world multimodal reasoning across seven different reasoning types.