AINeutralarXiv – CS AI · 3h ago7/10
🧠
KVoiceBench, KOpenAudioBench, and KMMAU: Agent-Driven Korean Speech Benchmarks for Evaluating SpeechLMs
Researchers introduce three new Korean speech benchmarks (KVoiceBench, KOpenAudioBench, and KMMAU) totaling 12,345 samples to evaluate multilingual speech language models, addressing the gap in non-English evaluation. The study reveals significant performance disparities between English and Korean across eight SpeechLMs, exposing weaknesses invisible to English-only testing.