Design exploration and VLSI implementation of deep neural network accelerator with configurable architecture

Raut, Gopal R.

Please use this identifier to cite or link to this item: https://dspace.iiti.ac.in/handle/123456789/10728

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	Vishvakarma, Santosh Kumar	-
dc.contributor.author	Raut, Gopal R.	-
dc.date.accessioned	2022-10-07T10:29:23Z	-
dc.date.available	2022-10-07T10:29:23Z	-
dc.date.issued	2022-09-21	-
dc.identifier.uri	https://dspace.iiti.ac.in/handle/123456789/10728	-
dc.description.abstract	Deep learning is a subset of artificial intelligence that involves deep neural networks (DNNs). Despite many decades of research on high-performance DNN accelerators, their massive computational demand still requires considerable hardware resources, an efficient architecture, parallel processing, and high memory bandwidth for computational acceleration. The implementation of DNNs faces the burden of excess area requirements due to resource-intensive elements such as multiply-and-accumulate (MAC) units and activation functions (AFs). Moreover, Edge-AI applications demand high-throughput DNN accelerators, which come with greater area utilization and high power consumption. Designing an area- and power-efficient DNN accelerator at the cost of an insignificant loss in throughput requires optimization of the MAC design, the AF, and the network complexity for efficient data flow. Further, ASIC-based hardware design of DNNs faces the challenges of offering functional configurability and of limited chip area. Addressing the hardware implementation of DNNs and targeting low-power, resource-constrained applications, this dissertation investigates low-power and efficient VLSI architectures for DNN accelerators. We explore and optimize the Co-ordinate Rotation Digital Computer (CORDIC) architecture for the evaluation of MAC and non-linear AF operations. Although CORDIC-based designs are area- and power-efficient, one of their significant drawbacks is low throughput. Therefore, we propose a performance-centric pipelined architecture for the CORDIC-based MAC and AF. Since pipeline stages come with more hardware resource utilization, we explore the mutual exclusivity between CORDIC stages and conduct a detailed study of how accuracy varies with the number of stages required to achieve high throughput. We present different topologies for the CORDIC-based MAC and AF with iterative and pipelined architectures. The proposed designs can be configured to compute both MAC and AF using the same hardware, which saves significant hardware resources.	en_US
dc.language.iso	en	en_US
dc.publisher	Department of Electrical Engineering, IIT Indore	en_US
dc.relation.ispartofseries	TH464	-
dc.subject	Electrical Engineering	en_US
dc.title	Design exploration and VLSI implementation of deep neural network accelerator with configurable architecture	en_US
dc.type	Thesis_Ph.D	en_US
Appears in Collections:	Department of Electrical Engineering_ETD

Files in This Item:

File	Description	Size	Format
TH_464_Gopal_R._Raut_1701102005.pdf		17.6 MB	Adobe PDF
