Please use this identifier to cite or link to this item: https://dspace.iiti.ac.in/handle/123456789/17843
Full metadata record
DC Field | Value | Language
dc.contributor.author | Trivedi, Vasundhara | en_US
dc.contributor.author | Bagga, Harman Singh | en_US
dc.contributor.author | Vishvakarma, Santosh Kumar | en_US
dc.date.accessioned | 2026-02-10T15:50:13Z | -
dc.date.available | 2026-02-10T15:50:13Z | -
dc.date.issued | 2026 | -
dc.identifier.citation | Trivedi, V., Raut, G., & Vishvakarma, S. K. (2026). Adaptive-precision SIMD architecture for high-throughput and resource-efficient DNN acceleration. Integration, 108. https://doi.org/10.1016/j.vlsi.2026.102666 | en_US
dc.identifier.issn | 0167-9260 | -
dc.identifier.other | EID(2-s2.0-105027733065) | -
dc.identifier.uri | https://dx.doi.org/10.1016/j.vlsi.2026.102666 | -
dc.identifier.uri | https://dspace.iiti.ac.in:8080/jspui/handle/123456789/17843 | -
dc.description.abstract | Deep Neural Network (DNN) accelerators require high computational throughput and flexible precision support while operating under stringent resource and power constraints. To address these challenges, we propose an adaptive-precision SIMD (Single Instruction, Multiple Data) Processing Element (PE) architecture for signed integer and fixed-point operations that maximizes resource utilization and enhances parallelism in multiply–accumulate (MAC) computations. The design introduces efficient resource reuse during partial product accumulation and supports both symmetric and asymmetric precision modes. Unlike state-of-the-art approaches, the proposed PE dynamically scales computation: processing 16 operands at low precision (4-bit), four operands at medium precision (8-bit), and a single operand at high precision (16-bit). Additionally, it supports asymmetric operations such as 16×4-bit multiplications in parallel, enabling unique flexibility and performance gains. The architecture is implemented and tested on ASIC and FPGA platforms. Accuracy evaluations across different DNN models and datasets show very small losses at reduced precision—less than 1% for LeNet on MNIST, 2.9% for AlexNet on CIFAR-10, 2.2% for VGG16 on CIFAR-10, and 3.5% for VGG16 on ImageNet-1000 compared to float32. Hardware synthesis yields significant improvements, including 46.2% fewer LUTs and 2.45× lower power on FPGA compared to existing designs. The proposed architecture delivers 2× higher throughput and up to 4.8× higher energy efficiency with 28.57% less area at 65 nm, compared to existing works, making it ideal for applications with variable precision and limited resources. © 2026 Elsevier B.V. | en_US
dc.language.iso | en | en_US
dc.publisher | Elsevier B.V. | en_US
dc.source | Integration | en_US
dc.title | Adaptive-precision SIMD architecture for high-throughput and resource-efficient DNN acceleration | en_US
dc.type | Journal Article | en_US
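The precision-scaling scheme described in the abstract (16 operands at 4-bit, four at 8-bit, one at 16-bit) can be illustrated with a minimal behavioral sketch. This is not the authors' RTL: the function name, packing order, and range checks are illustrative, and the asymmetric 16×4-bit mode is omitted for brevity; only the operand counts and bit widths come from the abstract.

```python
def simd_mac(a_ops, b_ops, bits):
    """Signed multiply-accumulate over one SIMD lane group.

    bits selects the symmetric precision mode (4, 8, or 16); the
    lane count scales inversely with precision, mirroring the
    16/4/1-operand scheme in the abstract.
    """
    lanes = {4: 16, 8: 4, 16: 1}[bits]
    assert len(a_ops) == len(b_ops) == lanes, "wrong operand count for mode"
    # Signed range for the selected precision, e.g. [-8, 7] for 4-bit.
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    assert all(lo <= x <= hi for x in list(a_ops) + list(b_ops))
    # All lane products are accumulated into a single result.
    return sum(a * b for a, b in zip(a_ops, b_ops))

# 4-bit mode: 16 parallel signed multiplies, one accumulated result.
print(simd_mac([1] * 16, [2] * 16, bits=4))   # 32
# 16-bit mode: a single high-precision multiply.
print(simd_mac([3000], [-7], bits=16))        # -21000
```

The dictionary lookup makes the throughput trade-off explicit: halving the operand width quadruples the number of parallel lanes, which is the source of the 2× throughput gain the abstract reports at reduced precision.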
Appears in Collections:Department of Electrical Engineering

Files in This Item:
There are no files associated with this item.


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
