y0news

#speech-to-text News & Analysis

4 articles tagged with #speech-to-text. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · May 28 · 7/10

Bandwidth-Efficient and Privacy-Preserving Edge-Cloud Many-to-Many Speech Translation

Researchers introduce ESRT, a privacy-preserving edge-cloud framework for multilingual speech-to-text translation that processes voice data locally while transmitting only compressed features to the cloud. The system achieves state-of-the-art performance across 45 languages while reducing bandwidth requirements by 10x and preventing voiceprint leakage.

AI · Neutral · arXiv – CS AI · May 28 · 6/10

Diffusion Large Language Models for Visual Speech Recognition

Researchers introduce DLLM-VSR, a diffusion-based large language model framework for visual speech recognition that replaces traditional left-to-right decoding with iterative masked denoising. The system achieves state-of-the-art 19.5% word error rate on LRS3 by using confidence-based unmasking and length-guided candidate decoding to resolve visual ambiguities.

AI · Neutral · TechCrunch – AI · Apr 6 · 4/10

Google quietly releases an offline-first AI dictation app on iOS

Google has quietly launched a new offline-first AI dictation app for iOS that uses Gemma AI models. By working offline, the app appears positioned to compete with existing dictation tools such as Wispr Flow.