The Fifth 'CHiME' Speech Separation and Recognition Challenge: Dataset, Task and Baselines (Paper)
Abstract
The CHiME challenge series aims to advance robust automatic speech recognition (ASR) technology by promoting research at the interface of speech and language processing, signal processing, and machine learning. This paper introduces the 5th CHiME Challenge, which considers the task of distant multi-microphone conversational ASR in real home environments. Speech material was elicited using a dinner party scenario with efforts taken to capture data that is representative of natural conversational speech and recorded by 6 Kinect microphone arrays and 4 binaural microphone pairs. The challenge features a single-array track and a multiple-array track and, for each track, distinct rankings will be produced for systems focusing on robustness with respect to distant-microphone capture vs. systems attempting to address all aspects of the task including conversational language modeling. We discuss the rationale for the challenge and provide a detailed description of the data collection procedure, the task, and the baseline systems for array synchronization, speech enhancement, and conventional and end-to-end ASR.