From Oracle to Noisy Context: Mitigating Contextual Exposure Bias in Speech-LLMs
🤖 AI Summary
Researchers developed a new training framework to address contextual exposure bias in Speech-LLMs, where models trained on perfect conversation history perform poorly when deployed with error-prone real-world context. Their approach combines teacher error knowledge (first-pass Whisper hypotheses), context dropout, and Direct Preference Optimization to improve robustness, reducing WER from 5.59% to 5.17% on TED-LIUM 3.
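The "context dropout" idea can be illustrated with a minimal sketch. This is a hypothetical illustration, not the paper's actual implementation: the function `build_context`, its parameters, and the dropout probability are all assumptions. The idea is that during training the model sometimes sees no history at all, and sometimes sees noisy first-pass hypotheses instead of oracle transcripts, so it learns not to over-trust context.

```python
import random

def build_context(history, p_drop=0.3, noisy_hypotheses=None):
    """Hypothetical sketch of context selection for contextual ASR training.

    history          -- oracle (ground-truth) transcripts of prior turns
    p_drop           -- probability of dropping the context entirely
    noisy_hypotheses -- optional first-pass ASR output (e.g. from Whisper)
                        used in place of oracle history
    """
    if random.random() < p_drop:
        # Context dropout: force the model to rely on the audio alone.
        return []
    if noisy_hypotheses is not None:
        # Error-prone context, mimicking real deployment conditions.
        return noisy_hypotheses
    # Otherwise fall back to the clean oracle history.
    return history
```

At inference time no dropout is applied; the point is only to diversify the training-time context distribution.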
Key Takeaways
- Speech-LLMs suffer from contextual exposure bias when trained on perfect oracle history but deployed with error-prone real conversation context.
- The proposed unified training framework uses Whisper hypotheses, context dropout, and Direct Preference Optimization to improve robustness.
- Experiments showed consistent improvements on both in-domain TED-LIUM 3 and zero-shot LibriSpeech datasets under realistic conditions.
- The DPO approach demonstrated the best resilience against irrelevant-context attacks, with minimal performance degradation.
- Code and models are publicly available, enabling broader adoption of the contextual speech recognition improvements.
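For reference, the standard Direct Preference Optimization objective mentioned in the takeaways can be sketched for a single preference pair. How the paper constructs chosen/rejected pairs is not stated here, so the function below is a generic DPO loss, with all argument names and the `beta` value being illustrative assumptions.

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for one preference pair.

    Encourages the policy to prefer the chosen continuation (e.g. the
    correct transcript under noisy context) over the rejected one,
    measured relative to a frozen reference model.
    """
    # Implicit reward margin between chosen and rejected completions.
    margin = ((logp_chosen - ref_logp_chosen)
              - (logp_rejected - ref_logp_rejected))
    # -log sigmoid(beta * margin), written via log1p for stability.
    return math.log1p(math.exp(-beta * margin))
```

A zero margin gives a loss of log 2; a positive margin (policy already prefers the chosen output more than the reference does) drives the loss below that.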
#speech-recognition #large-language-models #contextual-asr #whisper #direct-preference-optimization #machine-learning #arxiv-research
Read Original → via arXiv – CS AI