Voice of India: A Large-Scale Benchmark for Real-World Speech Recognition in India

ArXiv CS.CL | 2026-05-26

Authors: Kaushal Bhogale, Manas Dhir, Amritansh Walecha, Manmeet Kaur, Vanshika Chhabra, Aaditya Pareek, Hanuman Sidh, Mahima Manik, Sagar Jain, Bhaskar Singh, Utkarsh Singh, Tahir Javed, Shobhit Banga, Mitesh M. Khapra

Abstract

arXiv:2604.19151v2 (announce type: replace)

Existing Indic ASR benchmarks often rely on scripted, clean speech and on leaderboard-driven evaluation that encourages dataset-specific overfitting. In addition, strict single-reference WER penalizes natural spelling variation in Indian languages, including non-standardized spellings of code-mixed words of English origin. To address these limitations, we introduce Voice of India, a closed-source benchmark built from unscripted telephonic conversations covering 15 major Indian languages across 139 regional clusters. The dataset contains 306,230 utterances totaling 536 hours of speech from 36,691 speakers, with transcripts that account for spelling variations. We also analyze performance geographically at the district level, revealing disparities.
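The abstract does not say how the transcripts encode spelling variation or how the benchmark scores against them. As a minimal sketch of the general idea only, assume each reference position holds a set of accepted spellings and a hypothesis word counts as correct if it matches any of them. The function name `variant_aware_wer` and the example words are hypothetical illustrations, not taken from the dataset or the paper.

```python
from typing import List, Set


def variant_aware_wer(reference: List[Set[str]], hypothesis: List[str]) -> float:
    """Word error rate where each reference slot is a set of accepted spellings.

    Assumption: this is an illustrative scoring rule, not the benchmark's own.
    """
    n, m = len(reference), len(hypothesis)
    if n == 0:
        raise ValueError("reference must contain at least one word")

    # dist[i][j] = minimum edits to align the first i reference slots
    # with the first j hypothesis words (standard Levenshtein table).
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i  # i deletions
    for j in range(1, m + 1):
        dist[0][j] = j  # j insertions

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # A hypothesis word is correct if it matches ANY accepted spelling.
            is_match = hypothesis[j - 1] in reference[i - 1]
            dist[i][j] = min(
                dist[i - 1][j] + 1,                          # deletion
                dist[i][j - 1] + 1,                          # insertion
                dist[i - 1][j - 1] + (0 if is_match else 1)  # match or substitution
            )
    return dist[n][m] / n


# Hypothetical code-mixed utterance with romanized English-origin words.
hypothesis = "mera mobail richarj karo".split()
strict_reference = [{"mera"}, {"mobile"}, {"recharge"}, {"karo"}]
variant_reference = [{"mera"}, {"mobile", "mobail"}, {"recharge", "richarj"}, {"karo"}]

strict_score = variant_aware_wer(strict_reference, hypothesis)    # 0.5
variant_score = variant_aware_wer(variant_reference, hypothesis)  # 0.0
```

With a single strict reference, the two alternative spellings count as substitutions and the utterance scores 50% WER. Once those spellings are listed as accepted variants, the same hypothesis scores 0%.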
