Zhang, Ming; Xu, Jian; He, Jinzhong; Qin, Hong. Source: 2023 International Conference on High Performance Big Data and Intelligent Systems (HDIS 2023), p 118-123, 2023;
Abstract:
Deep convolutional neural networks (DCNs) have recently developed rapidly toward lightweight designs and edge deployment. However, accelerators for DCNs face challenges in balancing computation and data bandwidth, leading to inefficient computation and high hardware cost. In addition, differing network structures make it difficult to design and reconfigure accelerators flexibly. To address these issues, this paper proposes a parallel-serial channel accelerator system, which resolves the low multiplier utilization caused by layers with few channels and the inadequate bandwidth of fully connected layers. The results demonstrate that the proposed accelerator maintains high computational performance and efficiency on typical DCNs. When implemented on a Xilinx VCU128 at 200 MHz, it reaches a peak performance of 204.5 GOPS with an efficiency of 0.37 GOPS/DSP and a maximum computing-array utilization of 99.63%, surpassing previous works.
©2023 IEEE. (20 refs.)
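As a rough sanity check of the reported figures (not taken from the paper itself), the short sketch below relates peak throughput, DSP efficiency, and clock rate under the common convention that one MAC counts as two operations; the variable names and the 2-ops-per-MAC assumption are illustrative, not from the source.

    # Back-of-envelope check of the reported numbers.
    # Assumption: 1 MAC = 2 ops; "efficiency" = peak GOPS per DSP slice used.
    peak_gops = 204.5          # reported peak throughput
    eff_gops_per_dsp = 0.37    # reported efficiency
    clock_ghz = 0.200          # 200 MHz on Xilinx VCU128

    implied_dsps = peak_gops / eff_gops_per_dsp           # ~553 DSP slices implied
    ops_per_dsp_per_cycle = eff_gops_per_dsp / clock_ghz  # ~1.85, i.e. roughly one MAC per DSP per cycle

    print(f"implied DSP count:     {implied_dsps:.0f}")
    print(f"ops per DSP per cycle: {ops_per_dsp_per_cycle:.2f}")

Under these assumptions the reported efficiency corresponds to roughly one MAC per DSP per cycle, consistent with a near-fully-utilized multiplier array.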