Don't decay the learning rate, increase the batch size (paper)

2018 · arXiv (Cornell University) · Cited by 271
Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Stochastic Gradient Optimization Techniques
