From Oracle to Noisy Context: Mitigating Contextual Exposure Bias in Speech-LLMs
AI Summary
Researchers developed a training framework to address contextual exposure bias in Speech-LLMs: models trained on perfect (oracle) conversation history perform poorly when the context they see at deployment contains recognition errors. The approach combines teacher error knowledge from Whisper hypotheses, context dropout, and Direct Preference Optimization (DPO) to improve robustness, reducing word error rate (WER) from 5.59% to 5.17% on TED-LIUM 3.
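To put the reported numbers in perspective, the following is a minimal sketch of how WER is conventionally computed (word-level edit distance divided by reference length) and of the relative reduction implied by the two figures above. The function names are illustrative and not taken from the paper's code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the reference words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline: float, improved: float) -> float:
    """Fraction of the baseline error removed by the improved system."""
    return (baseline - improved) / baseline


# WER figures quoted in the summary (percent): roughly a 7.5% relative reduction.
print(f"{relative_reduction(5.59, 5.17):.1%}")
```

The absolute gain is 0.42 percentage points, which works out to about 7.5% relative to the 5.59% baseline.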
Key Takeaways
- Speech-LLMs suffer from contextual exposure bias when trained on perfect oracle history but deployed with error-prone real conversation context.
- The proposed unified training framework uses Whisper hypotheses, context dropout, and Direct Preference Optimization to improve robustness.
- Experiments showed consistent improvements on both in-domain TED-LIUM 3 and zero-shot LibriSpeech datasets under realistic conditions.
- The DPO approach demonstrated the best resilience against irrelevant-context attacks, with minimal performance degradation.
- Code and models are publicly available, enabling broader adoption of the contextual speech recognition improvements.
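Two of the ingredients named above can be sketched generically. The snippet below is an illustration under stated assumptions, not the paper's implementation: `apply_context_dropout` assumes dropout means discarding the whole conversation history with some probability, and `dpo_loss` is the standard DPO objective for a single preference pair. How the authors build preference pairs and which hyperparameters they use are not stated in this summary, so all names and values here are hypothetical.

```python
import math
import random


def apply_context_dropout(history, p_drop, rng):
    """Return an empty history with probability p_drop, else a copy of the history.

    Assumption: dropout removes the entire context; the paper may do otherwise.
    """
    return [] if rng.random() < p_drop else list(history)


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one pair of sequence log-probabilities.

    beta=0.1 is a placeholder value, not one reported by the authors.
    """
    margin = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # -log(sigmoid(margin)), written as a numerically stable softplus(-margin)
    return math.log1p(math.exp(-abs(margin))) + max(-margin, 0.0)


rng = random.Random(0)
history = ["previous utterance hypothesis one", "previous utterance hypothesis two"]
context = apply_context_dropout(history, p_drop=0.5, rng=rng)

# The loss is lower when the policy favors the chosen transcript more than the reference does.
print(dpo_loss(-1.0, -5.0, -2.0, -2.0) < dpo_loss(-5.0, -1.0, -2.0, -2.0))
```

Training on a mix of present and absent context discourages over-reliance on history, while the preference loss pushes the model toward the better transcript of each pair relative to a frozen reference model.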
#speech-recognition #large-language-models #contextual-asr #whisper #direct-preference-optimization #machine-learning #arxiv-research
Read Original via arXiv (CS AI)