AINeutralarXiv – CS AI · 18h ago6/10
🧠
GlobeAudio: A Multilingual Multicultural Benchmark for Naturalistic Evaluation of Large Audio-Language Models
GlobeAudio, a new benchmark dataset, evaluates Large Audio-Language Models across six languages using 5,637 naturally-sourced audio questions. The research reveals significant performance gaps in current LALMs, particularly for open-source models and low-resource languages, highlighting critical limitations in how audio-language AI systems handle real-world acoustic conditions.
🏢 Hugging Face