1-bit stochastic gradient descent and its application to data-parallel distributed training of speech DNNs (paper)

2014 · Cited by 887
Stochastic Gradient Optimization Techniques · Advanced Neural Network Applications · Neural Networks and Applications

1-bit stochastic gradient descent and its application to data-parallel distributed training of speech DNNs · Related techniques