y0news
#multi-audio
1 article
AI · Neutral · arXiv – CS AI · 3d ago · 7/10

MUGEN: Evaluating and Improving Multi-audio Understanding of Large Audio-Language Models

Researchers introduce MUGEN, a comprehensive benchmark that reveals significant weaknesses in large audio-language models when they process multiple concurrent audio inputs. The study shows that performance degrades sharply as the number of audio inputs grows, and it proposes Audio-Permutational Self-Consistency, a training-free method that improves accuracy by up to 6.74%.
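
The summary gives no detail on how Audio-Permutational Self-Consistency works, so the following is only a minimal sketch of what the name suggests: query the model with the same audio clips in several different orders and take a majority vote over the answers. Everything in it is an assumption for illustration, not the paper's implementation: the function name `permutation_self_consistency`, the `ask_model` callback (a hypothetical stand-in for a real audio-language model call that returns the position of the chosen clip), and the `max_orders` and `seed` parameters.

```python
from collections import Counter
from itertools import permutations
import random


def permutation_self_consistency(audios, ask_model, max_orders=6, seed=0):
    """Majority vote over answers obtained with permuted audio orders.

    Hypothetical interface (not from the paper): ask_model(ordered_audios)
    returns the position, within ordered_audios, of the clip the model picks.
    """
    # Enumerating every ordering is only practical for a handful of clips.
    orders = list(permutations(range(len(audios))))
    random.Random(seed).shuffle(orders)

    votes = Counter()
    for order in orders[:max_orders]:
        picked_pos = ask_model([audios[i] for i in order])
        # Map the picked position back to the clip's original index.
        votes[order[picked_pos]] += 1

    # Return the original index that received the most votes.
    return votes.most_common(1)[0][0]


def biased_toy_model(ordered_audios):
    """Toy stand-in: finds "target" unless it is presented first."""
    pos = ordered_audios.index("target")
    return pos if pos != 0 else 1


if __name__ == "__main__":
    clips = ["a", "target", "b"]
    print(permutation_self_consistency(clips, biased_toy_model))
```

In this toy setup the model answers wrongly whenever the target clip comes first, yet the vote across orderings still recovers the target's original index. No model weights are touched, which is consistent with the "training-free" description in the summary.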