Please use this identifier to cite or link to this item:
https://dspace.iiti.ac.in/handle/123456789/17842
Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Trivedi, Vasundhara | en_US |
| dc.contributor.author | Vishvakarma, Santosh Kumar | en_US |
| dc.date.accessioned | 2026-02-10T15:50:13Z | - |
| dc.date.available | 2026-02-10T15:50:13Z | - |
| dc.date.issued | 2026 | - |
| dc.identifier.citation | Trivedi, V., Raut, G., Mohammad, B. S., Vishvakarma, S. K., & Kumar, A. (2026). C-SIMD: CORDIC-Driven SIMD Processing Element for Resource-Efficient Multi-Precision Deep Learning Inference. IEEE Access. https://doi.org/10.1109/ACCESS.2026.3653253 | en_US |
| dc.identifier.other | EID(2-s2.0-105027538570) | - |
| dc.identifier.uri | https://dx.doi.org/10.1109/ACCESS.2026.3653253 | - |
| dc.identifier.uri | https://dspace.iiti.ac.in:8080/jspui/handle/123456789/17842 | - |
| dc.description.abstract | The growing demand for efficient deep learning inference on edge devices requires hardware that is both precision-adaptive and resource-efficient. This paper introduces C-SIMD, a CORDIC-driven, configurable SIMD Processing Element (PE) architecture for scalable, multi-precision MAC operations in DNN accelerators. C-SIMD supports dynamic operand precision (4/8/16/32-bit) and enables symmetric and asymmetric computation modes, covering integer and fixed-point arithmetic. By building higher-precision products from partial products computed with pipelined 8-bit CORDIC-based approximate multipliers, the architecture scales efficiently while achieving notable area and power savings. A configurable pipeline offers tunable trade-offs between accuracy and complexity, making C-SIMD suitable for resource-constrained inference. Strategic reuse of the adder in the accumulation path enhances throughput and optimizes resource utilization. Unlike prior designs, C-SIMD fully exploits available resources and supports configurations such as 16 parallel 8×8-bit, 4 parallel 16×16-bit, a single 32×32-bit, and asymmetric 32×8-bit MACs. Hardware evaluation demonstrates up to 14.29% area savings and as much as 16.17× throughput improvement. The proposed C-SIMD_Low (4/8/16) achieves 7.04 GOP/s, while C-SIMD_High (8/16/32) attains 4.16 GOP/s, delivering a 4× performance-efficiency gain over prior MAC architectures. Inference tests indicate minimal accuracy loss (below 1% on MNIST-LeNet, under 2.9% on CIFAR-10-AlexNet, and less than 2.2% on CIFAR-10-VGG16 relative to float32 baselines), demonstrating its potential for high-throughput, energy-efficient Edge-AI systems. | en_US |
| dc.language.iso | en | en_US |
| dc.publisher | Institute of Electrical and Electronics Engineers Inc. | en_US |
| dc.source | IEEE Access | en_US |
| dc.title | C-SIMD: CORDIC-Driven SIMD Processing Element for Resource-Efficient Multi-Precision Deep Learning Inference | en_US |
| dc.type | Journal Article | en_US |
| dc.rights.license | All Open Access | - |
| dc.rights.license | Gold Open Access | - |
| Appears in Collections: | Department of Electrical Engineering | |
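The abstract describes MAC units built from 8-bit CORDIC-based approximate multipliers. As a rough illustration of the underlying idea only (this is not the paper's pipeline, and the function name and parameters below are hypothetical), linear-mode CORDIC reduces multiplication to shift-and-add iterations, so truncating the iteration count trades accuracy for hardware cost:

```python
def cordic_multiply(x, z, iterations=8):
    """Approximate x * z with linear-mode (rotation) CORDIC.

    Each iteration only shifts and adds, which is why the technique
    maps cheaply to hardware. z must lie in roughly [-2, 2) for
    convergence; fewer iterations yield a coarser, "approximate"
    product. Illustrative sketch, not the C-SIMD implementation.
    """
    y = 0.0
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0   # steer z toward zero
        y += d * x * 2.0 ** -i        # accumulate shifted copies of x
        z -= d * 2.0 ** -i            # consume the multiplier
    return y                          # y -> x * z as iterations grow
```

With enough iterations the result converges (e.g. `cordic_multiply(0.5, 1.5, 20)` is close to 0.75); an 8-iteration variant corresponds to a deliberately approximate multiplier.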
Files in This Item:
There are no files associated with this item.