Please use this identifier to cite or link to this item:
https://dspace.iiti.ac.in/handle/123456789/9900
Title: | SCAN: Streamlined Composite Activation Function Unit for Deep Neural Accelerators |
Authors: | Rajput, Gunjan | Biyani, Kunika Naresh | Logashree, V. | Vishvakarma, Santosh Kumar |
Keywords: | Acceleration|Application specific integrated circuits|Chemical activation|Clocks|Hyperbolic functions|Integrated circuit design|Multilayer neural networks|Statistical tests|Stochastic systems|VLSI circuits|Activation functions|ASIC implementation|Function unit|Hyperbolic tangent|Nonlinear functions|Power gating|ReLU|Stochastic computing|Testing accuracy|VLSI implementation|Deep neural networks |
Issue Date: | 2022 |
Publisher: | Birkhauser |
Citation: | Rajput, G., Biyani, K. N., Logashree, V., & Vishvakarma, S. K. (2022). SCAN: Streamlined composite activation function unit for deep neural accelerators. Circuits, Systems, and Signal Processing, doi:10.1007/s00034-021-01947-8 |
Abstract: | Transcendental nonlinear function design in deep neural accelerators principally concerns performance parameters such as area, power, delay, and throughput. Neural hardware demands resource-intensive blocks such as adders, multipliers, and nonlinear activation functions. This work addresses the issues related to implementing the activation function for deep neural accelerators. The proposed design implements an activation function unit using stochastic computing together with clock gating to reduce active power dissipation in the hardware. However, a complete deep neural network uses various activation functions in its hidden layers. To avoid implementing a separate hardware design for each activation function, we have designed the streamlined composite activation function unit for neural accelerators (SCAN), which implements the hyperbolic tangent and ReLU activation functions. The proposed method, using stochastic computing along with clock gating, is compared with other state-of-the-art designs. The area is reduced by approximately 74.14% compared to that of a CORDIC-based design. When implementing a single neuron, both area and power are reduced severalfold, enhancing the performance of deep neural accelerators. Testing accuracy and inference time are measured on the AlexNet architecture using the benchmark MNIST dataset. Testing accuracy in the proposed implementation is increased by 1.08%, and loss is reduced by 40.66%. © 2022, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature. |
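The RTL for SCAN is not part of this record, but the stochastic-computing idea the abstract relies on is straightforward to illustrate in software. The sketch below is a minimal illustration, not the authors' design: it encodes values as bipolar stochastic bitstreams and approximates tanh with the classic saturating up/down-counter FSM ("stanh", in the Brown-and-Card style). SCAN also covers ReLU, but only the tanh path is sketched here; all names and parameters are illustrative assumptions.

```python
import math
import random

def to_bipolar_stream(x, n_bits, rng=random):
    """Encode x in [-1, 1] as a bipolar stochastic bitstream:
    each bit is 1 with probability (x + 1) / 2."""
    p = (x + 1.0) / 2.0
    return [1 if rng.random() < p else 0 for _ in range(n_bits)]

def from_bipolar_stream(bits):
    """Decode a bipolar bitstream back to an estimate in [-1, 1]."""
    return 2.0 * sum(bits) / len(bits) - 1.0

def stanh(bits, n_states=4):
    """FSM-based stochastic tanh: a saturating up/down counter with
    n_states states; the output bit is 1 while the counter sits in the
    upper half. For a bipolar input x this approximates tanh(n_states/2 * x)."""
    state = n_states // 2          # start in the middle of the state space
    out = []
    for b in bits:
        # saturating increment on a 1 bit, saturating decrement on a 0 bit
        state = min(state + 1, n_states - 1) if b else max(state - 1, 0)
        out.append(1 if state >= n_states // 2 else 0)
    return out

if __name__ == "__main__":
    random.seed(0)
    n = 1 << 14                    # longer streams -> lower estimation error
    for x in (-0.8, -0.4, 0.0, 0.4, 0.8):
        y = from_bipolar_stream(stanh(to_bipolar_stream(x, n)))
        print(f"x={x:+.1f}  SC tanh ~= {y:+.3f}  tanh(2x) = {math.tanh(2*x):+.3f}")
```

In hardware, such a counter FSM replaces multipliers and lookup tables, which is the kind of saving behind the area reduction the abstract reports; gating the FSM's clock while its input stream is idle would then cut active power, in the spirit of (though not necessarily identical to) the paper's scheme.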
URI: | https://dspace.iiti.ac.in/handle/123456789/9900 https://doi.org/10.1007/s00034-021-01947-8 |
ISSN: | 0278-081X |
Type of Material: | Journal Article |
Appears in Collections: | Department of Electrical Engineering |
Files in This Item:
There are no files associated with this item.