
From Oracle to Noisy Context: Mitigating Contextual Exposure Bias in Speech-LLMs

arXiv – CS AI | Xiaoyong Guo, Nanjie Li, Zijie Zeng, Kai Wang, Hao Huang, Haihua Xu, Wei Shi
🤖AI Summary

Researchers developed a training framework to mitigate contextual exposure bias in Speech-LLMs: models trained on perfect (oracle) conversation history degrade when deployed with error-prone real-world context. The approach combines teacher-generated error hypotheses, context dropout, and direct preference optimization (DPO) to improve robustness, reducing WER from 5.59% to 5.17% on TED-LIUM 3.
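The reported gain is measured in word error rate (WER), the word-level edit distance between the reference transcript and the model's hypothesis, normalized by reference length. A minimal illustrative implementation (not from the paper's released code):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: Levenshtein distance over words / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = minimum edits to turn ref[:i] into hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i  # deletions
    for j in range(len(hyp) + 1):
        dp[0][j] = j  # insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = dp[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            dp[i][j] = min(sub, dp[i - 1][j] + 1, dp[i][j - 1] + 1)
    return dp[len(ref)][len(hyp)] / len(ref)
```

For example, `wer("a b c d", "a x c d")` is 0.25 (one substitution in four reference words).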

Key Takeaways
  • Speech-LLMs suffer from contextual exposure bias when trained on perfect oracle history but deployed with error-prone real conversation context.
  • The proposed unified training framework uses Whisper hypotheses, context dropout, and Direct Preference Optimization to improve robustness.
  • Experiments showed consistent improvements on both in-domain TED-LIUM 3 and zero-shot LibriSpeech datasets under realistic conditions.
  • The DPO approach demonstrated best resilience against irrelevant-context attacks with minimal performance degradation.
  • Code and models are publicly available, enabling broader adoption of the contextual speech recognition improvements.