#speech-enhancement News & Analysis

7 articles tagged with #speech-enhancement. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Neutral · arXiv – CS AI · Jun 25 · 6/10

SE-AGCNet: An End-to-End Framework for Joint Speech Enhancement and Loudness Control in Meeting Scenarios

Researchers propose SE-AGCNet, an end-to-end framework that jointly optimizes speech enhancement and automatic gain control for meeting scenarios. The approach addresses limitations of traditional discrete audio processing pipelines by leveraging synergy between the two tasks, improving speech quality, loudness consistency, and automatic speech recognition accuracy.

AI · Neutral · arXiv – CS AI · Jun 19 · 6/10

QC-GAN: A Parameter-Efficient Quaternion Conformer GAN for High-Fidelity Speech Enhancement

Researchers introduce QC-GAN, a parameter-efficient speech enhancement model combining Quaternion Conformer architecture with MetricGAN training. The framework achieves state-of-the-art speech quality scores while using less than half the parameters of comparable models, with a 35K-parameter variant demonstrating viable ultra-lightweight performance.

AI · Neutral · arXiv – CS AI · Jun 9 · 5/10

Training-Free Intelligibility-Guided Observation Addition for Noisy ASR

Researchers propose a training-free method for improving automatic speech recognition in noisy environments by intelligently fusing noisy and speech-enhanced audio based on intelligibility estimates. The approach eliminates the need for trained neural predictors, reducing complexity while maintaining robustness across diverse speech enhancement and ASR model combinations.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

LLM-Guided Reinforcement Learning for Audio-Visual Speech Enhancement

Researchers have developed a new audio-visual speech enhancement framework that uses Large Language Models and reinforcement learning to improve speech quality. The method outperforms existing baselines by using LLM-generated natural language feedback as rewards for model training, providing more interpretable optimization compared to traditional scalar metrics.

AI · Bullish · arXiv – CS AI · Mar 5 · 5/10

MeanFlowSE: one-step generative speech enhancement via conditional mean flow

Researchers have developed MeanFlowSE, a new generative AI model for speech enhancement that performs single-step inference instead of requiring multiple computational steps. The method achieves strong audio quality with substantially lower computational costs, making it suitable for real-time applications without needing knowledge distillation or external teachers.

AI · Bullish · arXiv – CS AI · Mar 2 · 7/10 · 14

VoiceBridge: General Speech Restoration with One-step Latent Bridge Models

VoiceBridge is a new AI model that restores high-quality 48kHz speech from various types of audio distortion in a single step. The model uses a latent bridge approach with an energy-preserving variational autoencoder and transformer architecture to handle multiple speech restoration tasks simultaneously.

AI · Neutral · arXiv – CS AI · Mar 3 · 4/10 · 3

CodecFlow: Efficient Bandwidth Extension via Conditional Flow Matching in Neural Codec Latent Space

CodecFlow is a new neural codec-based framework for speech bandwidth extension that efficiently reconstructs high-quality audio in compact latent space. The system uses conditional flow matching and residual vector quantization to improve speech clarity by restoring high-frequency content from low-bandwidth audio.