AI · Neutral · arXiv CS AI · 4h ago
AudioCapBench: Quick Evaluation on Audio Captioning across Sound, Music, and Speech
Researchers introduce AudioCapBench, a new benchmark for evaluating how well large multimodal AI models generate captions for audio across three domains: sound, music, and speech. The study tested 13 models from OpenAI and Google Gemini and found that Gemini models generally outperformed OpenAI models in overall captioning quality, though all models struggled most with music captioning.