Area-Optimized 2D Interleaved Adder Tree Design for Sparse DCIM Edge Processing

Sankhe, Akash; Lokhande, Mukul; Sharma, Radheshyam; Vishvakarma, Santosh Kumar

Please use this identifier to cite or link to this item: https://dspace.iiti.ac.in/handle/123456789/16303

Full metadata record

DC Field	Value	Language
dc.contributor.author	Sankhe, Akash	en_US
dc.contributor.author	Lokhande, Mukul	en_US
dc.contributor.author	Sharma, Radheshyam	en_US
dc.contributor.author	Vishvakarma, Santosh Kumar	en_US
dc.date.accessioned	2025-06-20T06:39:35Z	-
dc.date.available	2025-06-20T06:39:35Z	-
dc.date.issued	2025	-
dc.identifier.citation	Sankhe, A., Lokhande, M., Sharma, R., & Vishvakarma, S. K. (2025). Area-Optimized 2D Interleaved Adder Tree Design for Sparse DCIM Edge Processing. Proceedings - International Symposium on Quality Electronic Design, ISQED. https://doi.org/10.1109/ISQED65160.2025.11014431	en_US
dc.identifier.issn	1948-3287	-
dc.identifier.other	EID(2-s2.0-105007524117)	-
dc.identifier.uri	https://dx.doi.org/10.1109/ISQED65160.2025.11014431	-
dc.identifier.uri	https://dspace.iiti.ac.in/handle/123456789/16303	-
dc.description.abstract	SRAM-embedded compute-in-memory (CIM) hardware has recently emerged as a promising way to mitigate the von Neumann bottleneck, showing noteworthy improvements in energy efficiency and throughput for matrix-vector multiplication, which accounts for a significant portion of neural-network computation. While process, voltage, and temperature (PVT) variations significantly impact traditional analog/mixed-signal (AMS) macros, digital CIM (DCIM) macros are more robust. This article proposes a DCIM macro that incorporates an 8-transistor (8T) SRAM bitcell capable of performing 1-bit multiplications and addressing the bit-flip issue that arises from the simultaneous activation of multiple array rows. The macro also includes a 2D interleaved adder tree built from a novel 7T-based ripple-carry adder (RCA), which significantly reduces the adder tree's area. The proposed 16 Kb DCIM macro computes 64 parallel products in a single clock cycle. It demonstrates 2× higher energy efficiency than recent state-of-the-art works in 65 nm CMOS. The macro is validated at 250 MHz and achieves classification accuracies of 98.7% and 98.8% at 1A4W precision, and 99.1% and 97.8% at 4A4W precision, for the LeNet-5 architecture on the MNIST and A-Z alphabet datasets, respectively. © 2025 IEEE.	en_US
dc.language.iso	en	en_US
dc.publisher	IEEE Computer Society	en_US
dc.source	Proceedings - International Symposium on Quality Electronic Design, ISQED	en_US
dc.subject	Compute-in-memory (CIM)	en_US
dc.subject	DNN (deep neural networks)	en_US
dc.subject	Edge-AI accelerators	en_US
dc.subject	multiply-and-accumulate	en_US
dc.subject	reconfigurable precision	en_US
dc.title	Area-Optimized 2D Interleaved Adder Tree Design for Sparse DCIM Edge Processing	en_US
dc.type	Conference Paper	en_US
Appears in Collections:	Department of Electrical Engineering

Files in This Item:

There are no files associated with this item.
