AI · Neutral · arXiv – CS AI · 18h ago · 6/10
AVI-Bench: Toward Human-like Audio-Visual Intelligence of Omni-MLLMs
Researchers introduce AVI-Bench, a comprehensive benchmark for evaluating audio-visual intelligence in multimodal large language models across perception, understanding, and reasoning tasks. The study reveals significant limitations in current models and proposes a taxonomy to guide development of more robust audio-visual AI systems.