The Difficulty of Training Deep Architectures and the Effect of Unsupervised Pre-Training (Paper)

2009 · International Conference on Artificial Intelligence and Statistics · Citations: 323

Topics: Artificial Intelligence; Neural Networks and Applications; Machine Learning and Data Classification; Generative Adversarial Networks and Image Synthesis

Abstract

Whereas theoretical work suggests that deep architectures might be more efficient at representing highly-varying functions, training deep architectures was unsuccessful until the recent advent of algorithms based on unsupervised pre-training. Even though these new algorithms have enabled training deep models, many questions remain as to the nature of this difficult learning problem. Answering these questions is important if learning in deep architectures is to be further improved. We attempt to shed some light on these questions through extensive simulations. The experiments confirm and clarify the advantage of unsupervised pre-training. They demonstrate the robustness of the training procedure with respect to the random initialization, the positive effect of pre-training in terms of optimization and its role as a regularizer. We empirically show the influence of pre-training with respect to architecture depth, model capacity, and number of training examples.

Authors (4)

Pascal Vincent

Samy Bengio

Pierre-Antoine Manzagol

Dumitru Erhan
