Frequency Principle: Fourier Analysis Sheds Light on Deep Neural Networks (Paper)

2020 · Communications in Computational Physics · Cited by 476

Artificial Intelligence · Model Reduction and Neural Networks · Neural Networks and Applications · Stochastic Gradient Optimization Techniques

Abstract

We study the training process of Deep Neural Networks (DNNs) from the Fourier analysis perspective. We demonstrate a very universal Frequency Principle (F-Principle) -- DNNs often fit target functions from low to high frequencies -- on high-dimensional benchmark datasets such as MNIST/CIFAR10 and deep neural networks such as VGG16. This F-Principle of DNNs is opposite to the behavior of most conventional iterative numerical schemes (e.g., the Jacobi method), which exhibit faster convergence for higher frequencies in various scientific computing problems. With a simple theory, we illustrate that this F-Principle results from the regularity of the commonly used activation functions. The F-Principle implies an implicit bias: DNNs tend to fit training data with a low-frequency function. This understanding provides an explanation for the good generalization of DNNs on most real datasets and the poor generalization of DNNs on the parity function or randomized datasets.
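The contrast the abstract draws with conventional iterative schemes can be made concrete with a small sketch. The setup below is an illustrative assumption, not taken from the paper: a 1D Laplace problem with zero boundary values, whose exact solution is zero so the iterate itself is the error, and a damped Jacobi sweep with weight `omega = 2/3`. Damping is used here because undamped Jacobi on this problem also leaves the very highest modes decaying slowly. The function names are hypothetical.

```python
import math

def damped_jacobi_step(u, omega=2.0 / 3.0):
    """One damped-Jacobi sweep for the stencil [-1, 2, -1] with zero right-hand side
    and zero boundary values. omega = 2/3 is an assumed, commonly used damping weight."""
    n = len(u)
    new = [0.0] * n
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        jacobi = 0.5 * (left + right)  # plain Jacobi update at grid point i
        new[i] = (1.0 - omega) * u[i] + omega * jacobi
    return new

def sine_mode(n, k):
    """Discrete sine mode of frequency index k on n interior grid points."""
    return [math.sin(k * math.pi * (i + 1) / (n + 1)) for i in range(n)]

def error_after(n, k, sweeps):
    """Max-norm of the error after `sweeps` sweeps, starting from sine mode k."""
    u = sine_mode(n, k)
    for _ in range(sweeps):
        u = damped_jacobi_step(u)
    return max(abs(x) for x in u)

n = 63
low = error_after(n, k=1, sweeps=10)    # lowest-frequency error mode
high = error_after(n, k=48, sweeps=10)  # a high-frequency error mode

print("low-frequency error after 10 sweeps: ", low)
print("high-frequency error after 10 sweeps:", high)
```

In this sketch the high-frequency error shrinks by many orders of magnitude in ten sweeps while the low-frequency error barely changes, which is the ordering the abstract describes as opposite to the F-Principle.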

Authors (4 of 5 shown)

Zheng Ma

Yanyang Xiao

T. Luo

Yaoyu Zhang
