The Fifth 'CHiME' Speech Separation and Recognition Challenge: Dataset, Task and Baselines 论文

2018引用 288
Speech and Audio ProcessingSpeech Recognition and SynthesisMusic and Audio Processing

摘要

The CHiME challenge series aims to advance robust automatic speech\nrecognition (ASR) technology by promoting research at the interface of speech\nand language processing, signal processing , and machine learning. This paper\nintroduces the 5th CHiME Challenge, which considers the task of distant\nmulti-microphone conversational ASR in real home environments. Speech material\nwas elicited using a dinner party scenario with efforts taken to capture data\nthat is representative of natural conversational speech and recorded by 6\nKinect microphone arrays and 4 binaural microphone pairs. The challenge\nfeatures a single-array track and a multiple-array track and, for each track,\ndistinct rankings will be produced for systems focusing on robustness with\nrespect to distant-microphone capture vs. systems attempting to address all\naspects of the task including conversational language modeling. We discuss the\nrationale for the challenge and provide a detailed description of the data\ncollection procedure, the task, and the baseline systems for array\nsynchronization, speech enhancement, and conventional and end-to-end ASR.\n