Don't decay the learning rate, increase the batch size (paper)
2018 · arXiv (Cornell University) · Cited by 271
Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Stochastic Gradient Optimization Techniques