The DARPA 1000-word resource management database for continuous speech recognition 论文

2003引用 341
Speech and dialogue systemsSpeech Recognition and SynthesisNatural Language Processing Techniques

摘要

A database of continuous read speech has been designed and recorded within the DARPA strategic computing speech recognition program. The data is intended for use in designing and evaluating algorithms for speaker-independent, speaker-adaptive and speaker-dependent speech recognition. The data consists of read sentences appropriate to a naval resource management task built around existing interactive database and graphics programs. The 1000-word task vocabulary is intended to be logically complete and habitable. The database, which represents over 21000 recorded utterances from 160 talkers with a variety of dialects, includes a partition of sentences and talkers for training and for testing purposes.< <ETX xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">&gt;</ETX>