Please use this identifier to cite or link to this item: https://dspace.iiti.ac.in/handle/123456789/12711
Full metadata record
DC Field | Value | Language
dc.contributor.author | Raut, Gopal | en_US
dc.contributor.author | Karkun, Saurabh | en_US
dc.contributor.author | Vishvakarma, Santosh Kumar | en_US
dc.date.accessioned | 2023-12-14T12:38:16Z | -
dc.date.available | 2023-12-14T12:38:16Z | -
dc.date.issued | 2023 | -
dc.identifier.citation | Raut, G., Karkun, S., & Vishvakarma, S. K. (2023). An Empirical Approach to Enhance Performance for Scalable CORDIC-Based Deep Neural Networks. ACM Transactions on Reconfigurable Technology and Systems. Scopus. https://doi.org/10.1145/3596220 | en_US
dc.identifier.issn | 1936-7406 | -
dc.identifier.other | EID(2-s2.0-85168803248) | -
dc.identifier.uri | https://doi.org/10.1145/3596220 | -
dc.identifier.uri | https://dspace.iiti.ac.in/handle/123456789/12711 | -
dc.description.abstract | Practical implementation of deep neural networks (DNNs) demands significant hardware resources, necessitating high computational power and memory bandwidth. While existing field-programmable gate array (FPGA)-based DNN accelerators are primarily optimized for fast single-task performance, cost, energy efficiency, and overall throughput are crucial considerations for their practical use in various applications. This article proposes a performance-centric pipelined Coordinate Rotation Digital Computer (CORDIC)-based MAC unit and implements a scalable CORDIC-based DNN architecture that is area- and power-efficient and has high throughput. The CORDIC-based neuron engine uses bit-rounding to maintain input-output precision with minimal hardware resource overhead. The results demonstrate the versatility of the proposed pipelined MAC, which operates at 460 MHz and allows for higher network throughput. A software-based implementation platform evaluates the proposed MAC operation's accuracy for larger neural networks and complex datasets. The DNN accelerator is designed with a parameterized and modular layer-multiplexed architecture. Empirical evaluation through Pareto analysis improves the efficiency of DNN implementations by fixing the arithmetic precision and the optimal number of pipeline stages. The proposed architecture uses layer-multiplexing, a technique that reuses a single DNN layer to enhance efficiency while maintaining modularity and adaptability for integrating various network configurations. The proposed CORDIC MAC-based DNN architecture is scalable to any bit precision and network size, and the DNN accelerator is prototyped on the Xilinx Virtex-7 VC707 FPGA board, operating at 66 MHz. The proposed design does not use any Xilinx macros, making it easily adaptable for ASIC implementation. Compared with state-of-the-art designs, the proposed design reduces resource use by 45% and power consumption by 4x without sacrificing performance. The accelerator is validated on the MNIST dataset, achieving 95.06% accuracy, only 0.35% less than other cutting-edge implementations. © 2023 Copyright held by the owner/author(s). Publication rights licensed to ACM. | en_US
dc.language.iso | en | en_US
dc.publisher | Association for Computing Machinery | en_US
dc.source | ACM Transactions on Reconfigurable Technology and Systems | en_US
dc.subject | ASIC | en_US
dc.subject | CORDIC | en_US
dc.subject | DNN accelerators | en_US
dc.subject | enhance throughput | en_US
dc.subject | modular architecture | en_US
dc.subject | Pareto analysis | en_US
dc.subject | pipeline MAC | en_US
dc.title | An Empirical Approach to Enhance Performance for Scalable CORDIC-Based Deep Neural Networks | en_US
dc.type | Journal Article | en_US
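The abstract refers to a MAC unit built from CORDIC iterations, i.e. a multiply-accumulate realized with only shifts and adds (linear-mode CORDIC) rather than a hardware multiplier. The following is a minimal Python sketch of that principle only, not the authors' RTL; the function name, iteration count, and convergence bound are illustrative assumptions.

```python
def cordic_mac(x, y, acc, iterations=16):
    """Linear-mode CORDIC multiply-accumulate: approximates acc + x * y.

    Each iteration uses only a sign test, a shift (multiplication by
    2**-i), and additions -- the shift-add structure a hardware CORDIC
    MAC pipelines one stage per iteration.
    Assumes |y| <= 2 (the convergence range of linear CORDIC starting
    at i = 0); iteration count sets the result precision.
    """
    z = acc
    for i in range(iterations):
        d = 1.0 if y >= 0 else -1.0   # rotation direction
        y -= d * 2.0 ** -i            # drive y toward zero
        z += d * x * 2.0 ** -i        # accumulate x * (removed part of y)
    return z                          # residual error ~ |x| * 2**-(iterations-1)
```

Each extra iteration adds roughly one bit of product precision, which is why the paper can trade pipeline depth against accuracy via Pareto analysis.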
Appears in Collections:Department of Electrical Engineering

Files in This Item:
There are no files associated with this item.


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
