AI · Neutral · arXiv – CS AI · 18h ago · 6/10
IMUG-Bench: Benchmarking Unified Multimodal Models on Interleaved Understanding and Generation
Researchers introduce IMUG-Bench, a benchmark that evaluates unified multimodal models (UMMs) on multi-turn interleaved image-text dialogues. The benchmark reveals that current models struggle with exposure bias in generation tasks, and that test-time scaling strategies such as Chain-of-Thought can improve performance.