A Comparative Study of Pretrained Transformer Models for Quranic ASR: Speech Representations, Label Formats, and Dataset Composition

Source: ArXiv CS.AI
Published: 2026-06-19
Type: NEWS
Language: en
Authors: Nabil Mosharraf Hossain (Greentech Apps Foundation, United Kingdom), Riasat Islam (Greentech Apps Foundation, United Kingdom, Queen Mary University of London, United Kingdom), Unaizah Obaidellah (University of Malaya, Malaysia)


Abstract

arXiv:2606.19747v1

Quran Automatic Speech Recognition (ASR) aims to convert Quranic recitation into text, enabling applications such as aided memorisation tools and Quranic search engines. However, existing ASR models often exhibit high Word Error Rates (WER) on user-recited verses and lack full coverage of the Quranic corpus. This paper presents a systematic empirical study of domain-specific fine-tuning of pretrained Transformer-based models for Quranic ASR, using advanced speech feature extraction methods: Wav2Vec2.0, HuBERT, and XLS-R. These models apply self-supervised learning by masking portions of input audio and using Transformer architectures to learn context-aware speech features. The pretrained models are fine-tuned on a filtered Quranic dataset exceeding 870 hours of professional and user recitations.
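For readers unfamiliar with the WER metric mentioned in the abstract, the following is a minimal sketch of how it is commonly computed: word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenisation are assumptions made for illustration; the abstract does not describe the authors' evaluation code, and text normalisation (for example, whether diacritics are kept) would change the result.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative helper (hypothetical name), not the paper's implementation.
    Words are split on whitespace with no further normalisation.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the first i-1 reference
    # words and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One of four reference words is missing from the hypothesis.
reference = "بسم الله الرحمن الرحيم"
hypothesis = "بسم الله الرحيم"
print(word_error_rate(reference, hypothesis))  # 0.25
```

Because the denominator is the reference length, WER can exceed 1.0 when the hypothesis contains many inserted words.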
