The OGI multi-language telephone speech corpus 论文

1992引用 253
Speech Recognition and SynthesisSpeech and dialogue systemsNatural Language Processing Techniques

摘要

The OGI Multi-language Telephone Speech Corpus is designed to support research on automatic language identification and multi-language speech recognition. The corpus consists of up to nine separate responses from each caller, ranging from single words to short topic-specific descriptions to 60 seconds of unconstrained spontaneous speech. The utterances were spoken over commercial telephone lines by speakers of English, Farsi (Persian), French, German, Japanese, Korean, Mandarin Chinese, Spanish, Tamil, and Vietnamese. We have completed the initial phase of our data acquisition effort: the recording and initial verification of utterances produced by 100 different speakers in each of the 10 languages. We describe the recording protocol, data collection procedure, ongoing corpus development, preliminary results of the statistical evaluation of the 10 languages, and plans to provide orthographic transcriptions of the speech. INTRODUCTION Research in multi-language recognition systems wou...