Please use this identifier to cite or link to this item: https://dspace.iiti.ac.in/handle/123456789/12711
Title: An Empirical Approach to Enhance Performance for Scalable CORDIC-Based Deep Neural Networks
Authors: Raut, Gopal
Karkun, Saurabh
Vishvakarma, Santosh Kumar
Keywords: ASIC;CORDIC;DNN accelerators;enhance throughput;modular architecture;Pareto analysis;pipeline MAC
Issue Date: 2023
Publisher: Association for Computing Machinery
Citation: Raut, G., Karkun, S., & Vishvakarma, S. K. (2023). An Empirical Approach to Enhance Performance for Scalable CORDIC-Based Deep Neural Networks. ACM Transactions on Reconfigurable Technology and Systems. Scopus. https://doi.org/10.1145/3596220
Abstract: Practical implementation of deep neural networks (DNNs) demands significant hardware resources, necessitating high computational power and memory bandwidth. While existing field-programmable gate array (FPGA)-based DNN accelerators are primarily optimized for fast single-task performance, cost, energy efficiency, and overall throughput are crucial considerations for their practical use in various applications. This article proposes a performance-centric pipeline Coordinate Rotation Digital Computer (CORDIC)-based MAC unit and implements a scalable CORDIC-based DNN architecture that is area- and power-efficient and has high throughput. The CORDIC-based neuron engine uses bit-rounding to maintain input-output precision and minimal hardware resource overhead. The results demonstrate the versatility of the proposed pipelined MAC, which operates at 460 MHz and allows for higher network throughput. A software-based implementation platform evaluates the proposed MAC operation's accuracy for more extensive neural networks and complex datasets. The DNN accelerator with parameterized and modular layer-multiplexed architecture is designed. Empirical evaluation through Pareto analysis is used to improve the efficiency of DNN implementations by fixing the arithmetic precision and optimal pipeline stages. The proposed architecture utilizes layer-multiplexing, a technique that effectively reuses a single DNN layer to enhance efficiency while maintaining modularity and adaptability for integrating various network configurations. The proposed CORDIC MAC-based DNN architecture is scalable for any bit-precision network size, and the DNN accelerator is prototyped using the Xilinx Virtex-7 VC707 FPGA board, operating at 66 MHz. The proposed design does not use any Xilinx macros, making it easily adaptable for ASIC implementation. Compared with state-of-the-art designs, the proposed design reduces resource use by 45% and power consumption by 4× without sacrificing performance. The accelerator is validated using the MNIST dataset, achieving 95.06% accuracy, only 0.35% less than other cutting-edge implementations. © 2023 Copyright held by the owner/author(s). Publication rights licensed to ACM.
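For readers unfamiliar with the idea of a CORDIC-based MAC, the following is a minimal software sketch of a multiply-accumulate built only from shifts and adds. It assumes the textbook linear-mode CORDIC rotation; the abstract does not state which CORDIC mode, word length, iteration count, or rounding scheme the authors use, so `FRAC_BITS`, `cordic_mac`, `cordic_dot`, and the helper names are all hypothetical, and the weights are assumed to lie in [-1, 1].

```python
# Illustrative linear-mode CORDIC multiply-accumulate (not the authors' design).
# All names and parameters here are assumptions made for this sketch.

FRAC_BITS = 16  # assumed number of fractional bits in the fixed-point format


def to_fixed(value, frac_bits=FRAC_BITS):
    """Quantize a real number to a signed fixed-point integer."""
    return int(round(value * (1 << frac_bits)))


def to_float(q, frac_bits=FRAC_BITS):
    """Convert a fixed-point integer back to a float."""
    return q / (1 << frac_bits)


def cordic_mac(acc, x, w, iterations=FRAC_BITS):
    """Return an approximation of acc + x * w using only shifts and adds.

    acc, x, w are fixed-point integers with FRAC_BITS fractional bits.
    The weight w must satisfy |w| <= 1.0 for the iteration to converge,
    and iterations must not exceed FRAC_BITS.
    """
    y, z = acc, w
    for i in range(1, iterations + 1):
        step = 1 << (FRAC_BITS - i)  # 2**-i in fixed point
        if z >= 0:
            y += x >> i   # add x * 2**-i (arithmetic shift)
            z -= step     # drive the residual weight toward zero
        else:
            y -= x >> i
            z += step
    return y


def cordic_dot(inputs, weights):
    """Accumulate a dot product of real-valued lists with cordic_mac."""
    acc = 0
    for x, w in zip(inputs, weights):
        acc = cordic_mac(acc, to_fixed(x), to_fixed(w))
    return to_float(acc)


if __name__ == "__main__":
    xs = [0.5, -0.25, 0.75, 0.125]
    ws = [0.3, -0.6, 0.9, -0.1]
    exact = sum(x * w for x, w in zip(xs, ws))
    print("CORDIC:", cordic_dot(xs, ws), "exact:", exact)
```

Each iteration halves the bound on the residual weight `z`, so the approximation error shrinks with the iteration count. In hardware this gives a direct trade-off between pipeline depth and arithmetic precision, which is the kind of trade-off the abstract says was explored through Pareto analysis.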
URI: https://doi.org/10.1145/3596220
https://dspace.iiti.ac.in/handle/123456789/12711
ISSN: 1936-7406
Type of Material: Journal Article
Appears in Collections: Department of Electrical Engineering