Whisper-RIR-Mega: A Paired Clean-Reverberant Speech Benchmark for ASR Robustness to Room Acoustics
AI Summary
Researchers introduce Whisper-RIR-Mega, a new benchmark dataset for testing automatic speech recognition robustness in reverberant acoustic environments. The study evaluates five Whisper models and finds that reverberation consistently degrades performance across all model sizes, with word error rates increasing by 0.12 to 1.07 percentage points.
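The reverberation penalty quoted above is a difference between two word error rates, expressed in percentage points. As a minimal sketch of that arithmetic (not the benchmark's released evaluation code; the function names `word_error_rate` and `reverb_penalty_pp` are hypothetical, and the lowercase-and-split normalization is an assumption), WER can be computed as a word-level edit distance divided by the reference length:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    # Assumed normalization: lowercase and split on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the processed reference prefix
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def reverb_penalty_pp(reference: str, clean_hyp: str, reverb_hyp: str) -> float:
    """WER on reverberant audio minus WER on clean audio, in percentage points."""
    clean_wer = word_error_rate(reference, clean_hyp)
    reverb_wer = word_error_rate(reference, reverb_hyp)
    return 100.0 * (reverb_wer - clean_wer)
```

This sketch scores a single utterance. Corpus-level WER is usually computed by summing edit errors and reference words over all utterances before dividing, so per-utterance penalties should not simply be averaged to reproduce a reported figure.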
Key Takeaways
- Whisper-RIR-Mega pairs clean LibriSpeech utterances with reverberant versions using real room impulse responses for ASR testing.
- All five tested Whisper models (tiny through large-v3) showed performance degradation in reverberant conditions.
- Reverberation penalty in word error rate ranged from 0.12 to 1.07 percentage points depending on model size.
- The dataset includes stratified splits by reverberation time and direct-to-reverberant ratio for comprehensive evaluation.
- Researchers released the dataset, evaluation code, and baseline results to support reproducible ASR research.
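The pairing and stratification described in the takeaways can be illustrated with a short sketch. It assumes the standard construction of a reverberant signal as the convolution of clean speech with a room impulse response (RIR), and it estimates the direct-to-reverberant ratio (DRR) with a 2.5 ms window around the RIR peak. The function names, the truncation to the clean length, and the window size are assumptions for illustration, not details confirmed by the source:

```python
import numpy as np


def reverberate(clean: np.ndarray, rir: np.ndarray) -> np.ndarray:
    """Convolve clean speech with an RIR and keep the original length."""
    # Truncating to len(clean) drops the reverberant tail past the end of the
    # utterance; this keeps the pair sample-aligned (an assumed design choice).
    return np.convolve(clean, rir, mode="full")[: len(clean)]


def direct_to_reverberant_ratio_db(
    rir: np.ndarray, sample_rate: int, window_ms: float = 2.5
) -> float:
    """Energy near the direct-path peak over the remaining energy, in dB."""
    peak = int(np.argmax(np.abs(rir)))
    half = int(round(window_ms * 1e-3 * sample_rate))
    start = max(0, peak - half)
    end = min(len(rir), peak + half + 1)

    direct_energy = float(np.sum(rir[start:end] ** 2))
    reverb_energy = float(np.sum(rir[:start] ** 2) + np.sum(rir[end:] ** 2))
    if reverb_energy == 0.0:
        return float("inf")  # no energy outside the direct-path window
    return 10.0 * np.log10(direct_energy / reverb_energy)
```

A benchmark could bin RIRs by a value like this, together with a reverberation-time estimate, to form stratified splits. How Whisper-RIR-Mega computes its own strata is not stated in this summary.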
Read Original via arXiv, CS AI