Framewise phoneme classification with bidirectional LSTM and other neural network architectures (paper)
2005; cited by 354
Speech Recognition and Synthesis; Music and Audio Processing; Topic Modeling
Abstract
In this paper, we apply bidirectional training to a Long Short Term Memory (LSTM) network for the first time. We also present a modified, full gradient version of the LSTM learning algorithm. On the TIMIT speech database, we measure the framewise phoneme classification ability of bidirectional and unidirectional variants of both LSTM and conventional Recurrent Neural Networks (RNNs). We find that the LSTM architecture outperforms conventional RNNs and that bidirectional networks outperform unidirectional ones.
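
As a rough illustration of the bidirectional, framewise setup summarized above, the sketch below runs one recurrent layer forward in time and a second one backward, then classifies each frame from the two hidden states joined together. It is not the paper's implementation: it uses plain tanh recurrent units in place of LSTM cells, random untrained weights, and no training step. The names `rnn_pass`, `bidirectional_framewise`, and `make_params`, and all layer sizes, are hypothetical choices made for this example.

```python
import math
import random


def rnn_pass(frames, w_in, w_rec, reverse=False):
    """Run a tanh recurrent layer over the frames; return one hidden vector per frame."""
    n_hidden = len(w_rec)
    # The backward layer visits frames from last to first.
    order = range(len(frames) - 1, -1, -1) if reverse else range(len(frames))
    h = [0.0] * n_hidden
    states = [None] * len(frames)
    for t in order:
        x = frames[t]
        h = [
            math.tanh(
                sum(w_in[j][i] * x[i] for i in range(len(x)))
                + sum(w_rec[j][k] * h[k] for k in range(n_hidden))
            )
            for j in range(n_hidden)
        ]
        states[t] = h
    return states


def softmax(scores):
    """Turn raw scores into a probability distribution."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def bidirectional_framewise(frames, params):
    """Return one class distribution per frame, using past and future context."""
    fwd = rnn_pass(frames, params["w_in_f"], params["w_rec_f"])
    bwd = rnn_pass(frames, params["w_in_b"], params["w_rec_b"], reverse=True)
    outputs = []
    for h_fwd, h_bwd in zip(fwd, bwd):
        h = h_fwd + h_bwd  # list concatenation of the two hidden states
        scores = [sum(w * v for w, v in zip(row, h)) for row in params["w_out"]]
        outputs.append(softmax(scores))
    return outputs


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def make_params(n_in, n_hidden, n_classes, seed=0):
    """Random, untrained weights (assumption: illustration only)."""
    rng = random.Random(seed)
    return {
        "w_in_f": random_matrix(n_hidden, n_in, rng),
        "w_rec_f": random_matrix(n_hidden, n_hidden, rng),
        "w_in_b": random_matrix(n_hidden, n_in, rng),
        "w_rec_b": random_matrix(n_hidden, n_hidden, rng),
        "w_out": random_matrix(n_classes, 2 * n_hidden, rng),
    }


# Usage: 6 frames of 3 made-up features each, 5 made-up phoneme classes.
demo_rng = random.Random(1)
demo_frames = [[demo_rng.uniform(-1, 1) for _ in range(3)] for _ in range(6)]
demo_probs = bidirectional_framewise(demo_frames, make_params(3, 4, 5))
demo_labels = [p.index(max(p)) for p in demo_probs]  # one predicted class per frame
```

In this sketch, the output for the first frame changes if the last frame is altered, because the backward layer carries information from later frames. A unidirectional variant would use only the `fwd` states and would see past context alone.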