SpatialScore: Towards Comprehensive Evaluation for Spatial Intelligence
Researchers introduce SpatialScore, a benchmark of 5K samples across 30 tasks for evaluating the spatial reasoning capabilities of multimodal language models. The work also includes SpatialCorpus, a 331K-sample training dataset, and SpatialAgent, a multi-agent system with 12 specialized tools that improves spatial reasoning without additional model training.