Please use this identifier to cite or link to this item:
https://dspace.iiti.ac.in/handle/123456789/15937
Title: | Flex-PE: Flexible and SIMD Multiprecision Processing Element for AI Workloads |
Authors: | Lokhande, Mukul; Raut, Gopal; Vishvakarma, Santosh Kumar |
Keywords: | Activation functions (AFs);CORDIC;deep learning accelerators;multiprecision systolic arrays;single instruction, multiple data (SIMD) processing elements |
Issue Date: | 2025 |
Publisher: | Institute of Electrical and Electronics Engineers Inc. |
Citation: | Lokhande, M., Raut, G., & Vishvakarma, S. K. (2025). Flex-PE: Flexible and SIMD Multiprecision Processing Element for AI Workloads. IEEE Transactions on Very Large Scale Integration (VLSI) Systems. https://doi.org/10.1109/TVLSI.2025.3553069 |
Abstract: | The rapid evolution of artificial intelligence (AI) models, from deep neural networks (DNNs) to transformers/large language models (LLMs), demands flexible hardware solutions to meet diverse execution needs across edge and cloud platforms. Existing accelerators lack unified support for multiprecision arithmetic and runtime-configurable activation functions (AFs). This work proposes Flex-PE, a single instruction, multiple data (SIMD)-enabled multiprecision processing element that efficiently integrates multiply-and-accumulate operations with configurable AFs, including Sigmoid, Tanh, ReLU, and SoftMax, using unified hardware. The proposed design achieves throughput improvements of up to 16x for FxP4, 8x for FxP8, 4x for FxP16, and 1x for FxP32, with maximum hardware efficiency for both iterative and pipelined architectures. An area-efficient iterative Flex-PE-based SIMD systolic array reduces DMA reads by up to 62x and 371x for input feature maps and weight filters, respectively, in VGG-16, achieving 8.42 GOPS/W energy efficiency with minimal accuracy loss (<2%). Flex-PE scales from 4-bit edge inference to FxP8/16/32, supporting edge and cloud high-performance computing (HPC) while providing high-performance adaptable AI hardware with optimal precision, throughput, and energy efficiency. © 1993-2012 IEEE. |
URI: | https://doi.org/10.1109/TVLSI.2025.3553069 https://dspace.iiti.ac.in/handle/123456789/15937 |
ISSN: | 1063-8210 |
Type of Material: | Journal Article |
Appears in Collections: | Department of Electrical Engineering |
Files in This Item:
There are no files associated with this item.