AI · Neutral · arXiv – CS AI · 6h ago · 5/10
A Comparative Study of Pretrained Transformer Models for Quranic ASR: Speech Representations, Label Formats, and Dataset Composition
Researchers fine-tuned pretrained Transformer speech models (Wav2Vec2.0, HuBERT, XLS-R) for Automatic Speech Recognition (ASR) of Quranic recitation, reaching a word error rate (WER) of 8% against a 16.3% baseline. The study reports that domain-specific fine-tuning on more than 870 hours of professional and user-recited Quranic audio, with Arabic transcripts stripped of diacritics as the label format, substantially improves transcription accuracy while cutting training time by 71%.
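For readers unfamiliar with the two technical terms in the summary, here is a minimal sketch of what "text without diacritics" and "word error rate" mean in practice. It is an illustration only, not the paper's code: the function names `strip_diacritics` and `word_error_rate` are hypothetical, and the set of diacritic code points removed (U+064B to U+0652 plus U+0670) is an assumption that may differ from the normalization the authors used.

```python
import re

# Assumption: treat the common Arabic tashkeel marks (U+064B..U+0652) and the
# superscript alef (U+0670) as "diacritics". The paper may normalize differently.
_DIACRITICS = re.compile(r"[\u064B-\u0652\u0670]")


def strip_diacritics(text: str) -> str:
    """Return the text with the diacritic marks listed above removed."""
    return _DIACRITICS.sub("", text)


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

Under this definition, a WER of 8% means roughly 8 word-level substitutions, deletions, or insertions per 100 reference words. Stripping diacritics from both reference and hypothesis before scoring means vowel-mark mismatches are not counted as errors.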