The Microsoft 2016 Conversational Speech Recognition System (paper)

2017 | Citations: 336

Speech Recognition and Synthesis; Speech and Audio Processing; Music and Audio Processing

Abstract

We describe Microsoft's conversational speech recognition system, in which we combine recent developments in neural-network-based acoustic and language modeling to advance the state of the art on the Switchboard recognition task. Inspired by machine learning ensemble techniques, the system uses a range of convolutional and recurrent neural networks. I-vector modeling and lattice-free MMI training provide significant gains for all acoustic model architectures. Language model rescoring with multiple forward and backward running RNNLMs, and word posterior-based system combination provide a 20% boost. The best single system uses a ResNet architecture acoustic model with RNNLM rescoring, and achieves a word error rate of 6.9% on the NIST 2000 Switchboard task. The combined system has an error rate of 6.2%, representing an improvement over previously reported results on this benchmark task.
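The abstract reports its results as word error rates (WER). As a rough illustration of what that metric measures, the sketch below computes WER as the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words, and also computes the relative reduction between the two reported figures. The function names `word_error_rate` and `relative_reduction` are illustrative assumptions, not taken from the paper, and the sketch omits the text normalization that official benchmark scoring tools apply.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative sketch only: splits on whitespace and applies no text
    normalization, unlike official benchmark scoring.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion (reference word missed)
                curr[j - 1] + 1,                  # insertion (extra hypothesis word)
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline: float, improved: float) -> float:
    """Fraction by which `improved` lowers the error rate relative to `baseline`."""
    return (baseline - improved) / baseline


if __name__ == "__main__":
    # One deleted word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
    # Best single system (6.9%) versus combined system (6.2%) from the abstract.
    print(f"{relative_reduction(6.9, 6.2):.1%}")
```

Under this definition, going from the best single system (6.9%) to the combined system (6.2%) is a relative WER reduction of about 10%.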

Authors (4 of 7 shown)

Geoffrey Zweig

Andreas Stolcke

Michael L. Seltzer

Frank Seide
