AI · Bearish · arXiv – CS AI · 15h ago · 6/10
PitchBench: Measuring Pitch Hearing in Audio-Language Models
Researchers introduce PitchBench, an evaluation suite showing that audio-language models (ALMs) struggle significantly with pitch hearing, a fundamental musical perception task. Across the benchmark's 28 experiments, performance is inconsistent across acoustic conditions, instrument types, and response formats, indicating that current ALMs lack reliable pitch perception despite their growing real-world deployment in music applications.