Maximizing CNN Accelerator Efficiency Through Resource Partitioning (Paper)
Year: 2017. Citations: 295
Advanced Neural Network Applications; Advanced Memory and Neural Computing; Adversarial Robustness in Machine Learning
Abstract
Convolutional neural networks (CNNs) are revolutionizing machine learning, but they present significant computational challenges. Recently, many FPGA-based accelerators have been proposed to improve the performance and efficiency of CNNs. Current approaches construct a single processor that computes the CNN layers one at a time; the processor is optimized to maximize the throughput at which the collection of layers is computed. However, this approach leads to inefficient designs because the same processor structure is used to compute CNN layers of radically varying dimensions.
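The inefficiency described above can be illustrated with a small sketch. It assumes a simplified processor model that the abstract does not specify: a fixed array of `tn * tm` multipliers that processes `tn` input channels and `tm` output channels per step. The function name, the array size, and the layer dimensions are all hypothetical and chosen only for illustration.

```python
from math import ceil


def utilization(n_in, m_out, tn, tm):
    """Fraction of multipliers doing useful work for one layer.

    Assumed model (not taken from the abstract): the processor handles
    tn input channels and tm output channels per step, so a layer with
    n_in inputs and m_out outputs needs ceil(n_in / tn) * ceil(m_out / tm)
    steps, each occupying all tn * tm multipliers.
    """
    steps = ceil(n_in / tn) * ceil(m_out / tm)
    useful_work = n_in * m_out
    provisioned_work = steps * tn * tm
    return useful_work / provisioned_work


# Hypothetical (input channels, output channels) per layer.
layers = [(3, 64), (64, 128), (128, 256)]

# One fixed processor shape shared by every layer (illustrative values).
tn, tm = 8, 64

for n_in, m_out in layers:
    u = utilization(n_in, m_out, tn, tm)
    print(f"layer {n_in:>3} -> {m_out:>3}: utilization {u:.3f}")
```

Under these assumptions the first layer, with only 3 input channels, keeps 3 of every 8 input lanes busy, while the later layers fill the array completely. A processor shape that suits one layer can therefore leave much of the hardware idle on another, which is the mismatch the abstract points to.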