Cross-entropy vs. squared error training: a theoretical and experimental comparison (Paper)

2013 · Cited by 221

Topics: Artificial Intelligence; Neural Networks and Applications; Machine Learning and Algorithms; Machine Learning and Data Classification

Abstract

In this paper we investigate the error criteria that are optimized during the training of artificial neural networks (ANN). We compare the bounds of the squared error (SE) and the cross-entropy (CE) criteria, which are the most popular choices in state-of-the-art implementations. The evaluation is performed on automatic speech recognition (ASR) and handwriting recognition (HWR) tasks using a hybrid HMM-ANN model. We find that with randomly initialized weights, the squared error based ANN does not converge to a good local optimum. However, with a good initialization by pre-training, the word error rate of our best CE trained system could be reduced from 30.9% to 30.5% on the ASR task, and from 22.7% to 21.9% on the HWR task, by performing a few additional “fine-tuning” iterations with the SE criterion.

Index Terms: hybrid approach, training criterion for ANN training, automatic speech recognition, handwriting recognition
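As a rough illustration of the two criteria named in the abstract, the following minimal sketch computes the cross-entropy and the squared error for a single softmax output against a one-hot target. The function names and the example logits are assumptions made for illustration only; this is not the paper's implementation or its exact formulation. Under these assumptions it shows one general property: the squared error of a softmax output stays within [0, 2], while the cross-entropy grows without bound as the probability of the correct class approaches zero.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, target):
    """CE for a one-hot target: negative log-probability of the correct class."""
    return -math.log(probs[target])

def squared_error(probs, target):
    """SE for a one-hot target: sum of squared differences over all classes."""
    return sum(
        (p - (1.0 if k == target else 0.0)) ** 2
        for k, p in enumerate(probs)
    )

# Hypothetical logits for a 3-class problem; class 0 is the correct label.
good = softmax([2.0, 0.5, -1.0])    # mostly correct prediction
bad = softmax([-20.0, 20.0, 0.0])   # confidently wrong prediction

for name, probs in (("good", good), ("bad", bad)):
    print(name, "CE =", cross_entropy(probs, 0), "SE =", squared_error(probs, 0))
```

For the confidently wrong prediction the cross-entropy is large while the squared error saturates close to its upper limit of 2, so the two criteria penalize such frames very differently.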

Authors

Hermann Ney

Patrick Doetsch

Pavel Golik
