arXiv – CS AI
Towards Human-Like Interactive Speech Recognition With Agentic Correction and Semantic Evaluation
Researchers introduce Agentic ASR, a multi-turn interactive speech recognition framework that iteratively refines recognized speech through semantic correction and reasoning-based editing. The approach addresses the limitations of single-pass ASR systems by aligning recognition with human communication patterns. The work also introduces a semantic evaluation metric, S²ER, which is reported to capture meaning-critical errors better than traditional token-level metrics.
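To make the limitation of token-level metrics concrete, here is a minimal sketch of word error rate (WER), the standard token-level ASR metric. It does not implement S²ER, whose definition is not given in this summary. The function name `word_error_rate` and the example sentences are illustrative assumptions, not taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical transcripts chosen for illustration only.
reference = "please do not cancel my order"
meaning_flipped = "please do cancel my order"      # drops "not"
harmless_slip = "please do not cancel me order"    # "my" -> "me"

wer_flipped = word_error_rate(reference, meaning_flipped)
wer_harmless = word_error_rate(reference, harmless_slip)
print(wer_flipped, wer_harmless)
```

Both hypotheses contain exactly one word-level error against a six-word reference, so WER scores them identically, even though only the first reverses the speaker's intent. That gap is the kind of meaning-critical error a semantic metric is meant to expose.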