Maximizing CNN Accelerator Efficiency Through Resource Partitioning (paper)

2017 · Cited by 295
Topics: Advanced Neural Network Applications · Advanced Memory and Neural Computing · Adversarial Robustness in Machine Learning

Abstract

Convolutional neural networks (CNNs) are revolutionizing machine learning, but they present significant computational challenges. Recently, many FPGA-based accelerators have been proposed to improve the performance and efficiency of CNNs. Current approaches construct a single processor that computes the CNN layers one at a time; the processor is optimized to maximize the throughput at which the collection of layers is computed. However, this approach leads to inefficient designs because the same processor structure is used to compute CNN layers of radically varying dimensions.
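
The inefficiency described above can be illustrated with a rough utilization model. This is a minimal sketch under stated assumptions, not the authors' model: it assumes a processor that multiplies `tn` input channels by `tm` output channels per cycle, and the layer shapes, `TN`, `TM`, and the `utilization` helper are all hypothetical names and values chosen for illustration.

```python
from math import ceil


def utilization(n_in, m_out, tn, tm):
    """Fraction of multiplier slots doing useful work on one layer.

    Assumption: the processor handles tn input channels x tm output
    channels per cycle, and partial tiles still occupy a full cycle.
    """
    cycles = ceil(n_in / tn) * ceil(m_out / tm)
    return (n_in * m_out) / (cycles * tn * tm)


# Hypothetical layer shapes: (input channels, output channels).
layers = [(3, 48), (48, 128), (256, 192)]

# Hypothetical fixed processor shape shared by every layer.
TN, TM = 7, 64

for n_in, m_out in layers:
    u = utilization(n_in, m_out, TN, TM)
    print(f"layer {n_in:>3} -> {m_out:>3}: utilization {u:.2f}")
```

In this toy setting the first layer fills only 144 of the 448 multiplier slots (about a third), while the wider layers come close to full utilization. A single fixed shape cannot fit all layers equally well, which is the mismatch the abstract points to.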