Please use this identifier to cite or link to this item: https://dspace.iiti.ac.in/handle/123456789/12711
Full metadata record
DC Field | Value | Language
dc.contributor.author | Raut, Gopal | en_US
dc.contributor.author | Karkun, Saurabh | en_US
dc.contributor.author | Vishvakarma, Santosh Kumar | en_US
dc.date.accessioned | 2023-12-14T12:38:16Z | -
dc.date.available | 2023-12-14T12:38:16Z | -
dc.date.issued | 2023 | -
dc.identifier.citation | Raut, G., Karkun, S., & Vishvakarma, S. K. (2023). An Empirical Approach to Enhance Performance for Scalable CORDIC-Based Deep Neural Networks. ACM Transactions on Reconfigurable Technology and Systems. Scopus. https://doi.org/10.1145/3596220 | en_US
dc.identifier.issn | 1936-7406 | -
dc.identifier.other | EID(2-s2.0-85168803248) | -
dc.identifier.uri | https://doi.org/10.1145/3596220 | -
dc.identifier.uri | https://dspace.iiti.ac.in/handle/123456789/12711 | -
dc.description.abstract | Practical implementation of deep neural networks (DNNs) demands significant hardware resources, necessitating high computational power and memory bandwidth. While existing field-programmable gate array (FPGA)-based DNN accelerators are primarily optimized for fast single-task performance, cost, energy efficiency, and overall throughput are crucial considerations for their practical use in various applications. This article proposes a performance-centric pipelined Coordinate Rotation Digital Computer (CORDIC)-based MAC unit and implements a scalable CORDIC-based DNN architecture that is area- and power-efficient and has high throughput. The CORDIC-based neuron engine uses bit-rounding to maintain input-output precision with minimal hardware resource overhead. The results demonstrate the versatility of the proposed pipelined MAC, which operates at 460 MHz and allows for higher network throughput. A software-based implementation platform evaluates the proposed MAC operation's accuracy for larger neural networks and complex datasets. The DNN accelerator is designed with a parameterized and modular layer-multiplexed architecture. Empirical evaluation through Pareto analysis improves the efficiency of DNN implementations by fixing the arithmetic precision and the optimal number of pipeline stages. The proposed architecture uses layer-multiplexing, a technique that reuses a single DNN layer to enhance efficiency while maintaining modularity and adaptability for integrating various network configurations. The proposed CORDIC MAC-based DNN architecture is scalable to any bit precision and network size, and the DNN accelerator is prototyped on the Xilinx Virtex-7 VC707 FPGA board, operating at 66 MHz. The proposed design does not use any Xilinx macros, making it easily adaptable for ASIC implementation. Compared with state-of-the-art designs, the proposed design reduces resource use by 45% and power consumption by 4x without sacrificing performance. The accelerator is validated on the MNIST dataset, achieving 95.06% accuracy, only 0.35% less than other cutting-edge implementations. © 2023 Copyright held by the owner/author(s). Publication rights licensed to ACM. | en_US
dc.language.iso | en | en_US
dc.publisher | Association for Computing Machinery | en_US
dc.source | ACM Transactions on Reconfigurable Technology and Systems | en_US
dc.subject | ASIC | en_US
dc.subject | CORDIC | en_US
dc.subject | DNN accelerators | en_US
dc.subject | enhance throughput | en_US
dc.subject | modular architecture | en_US
dc.subject | Pareto analysis | en_US
dc.subject | pipeline MAC | en_US
dc.title | An Empirical Approach to Enhance Performance for Scalable CORDIC-Based Deep Neural Networks | en_US
dc.type | Journal Article | en_US
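The abstract refers to a MAC unit built from CORDIC iterations, i.e. a multiply-accumulate realized with only shifts and adds (linear-mode CORDIC) rather than a hardware multiplier. The following is a minimal Python sketch of that principle only, not the authors' RTL; the function name, iteration count, and convergence bound are illustrative assumptions.

```python
def cordic_mac(x, y, acc, iterations=16):
    """Linear-mode CORDIC multiply-accumulate: approximates acc + x * y.

    Each iteration uses only a sign test, a shift (multiplication by
    2**-i), and additions -- the shift-add structure a hardware CORDIC
    MAC pipelines one stage per iteration.
    Assumes |y| <= 2 (the convergence range of linear CORDIC starting
    at i = 0); iteration count sets the result precision.
    """
    z = acc
    for i in range(iterations):
        d = 1.0 if y >= 0 else -1.0   # rotation direction
        y -= d * 2.0 ** -i            # drive y toward zero
        z += d * x * 2.0 ** -i        # accumulate x * (removed part of y)
    return z                          # residual error ~ |x| * 2**-(iterations-1)
```

Each extra iteration adds roughly one bit of product precision, which is why the paper can trade pipeline depth against accuracy via Pareto analysis.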
Appears in Collections:Department of Electrical Engineering

Files in This Item:
There are no files associated with this item.


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
