AIBullisharXiv β CS AI Β· 5h ago1
π§
When Scaling Fails: Mitigating Audio Perception Decay of LALMs via Multi-Step Perception-Aware Reasoning
Researchers identified a critical problem in Large Audio-Language Models (LALMs) where audio perception deteriorates during extended reasoning processes. They developed MPARΒ² framework using reinforcement learning, which improved perception performance from 31.74% to 63.51% and achieved 74.59% accuracy on MMAU benchmark.