  • Guo, Yan-Cheng; Chang, Tian-Sheuan; Lin, Chih-Sheng; Chiou, Bo-Cheng; Lai, Chih-Ming; Sheu, Shyh-Shyuan; Lo, Wei-Chung; Chang, Shih-Chieh

    2024 IEEE International Symposium on Circuits and Systems (ISCAS), 19 May 2024
    Conference Proceeding

    Computing-in-memory (CIM) is renowned in deep learning for its high energy efficiency, which stems from highly parallel computation with minimal data movement. However, current SRAM-based CIM designs suffer from long latency when loading weights or feature maps from DRAM for large AI models, and previous SRAM-based CIM architectures lack support for end-to-end model inference. To address these issues, this paper proposes CIMR-V, an end-to-end CIM accelerator with RISC-V that incorporates CIM layer fusion, a convolution/max-pooling pipeline, and weight fusion, yielding an 85.14% latency reduction for the keyword spotting model. Furthermore, the proposed CIM-type instructions enable end-to-end AI model inference and a full-stack flow, effectively combining the high energy efficiency of CIM with the high programmability of RISC-V. Implemented in TSMC 28 nm technology, the proposed design achieves an energy efficiency of 3707.84 TOPS/W and a throughput of 26.21 TOPS at 50 MHz.
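
    The latency savings claimed in the abstract come from keeping intermediate results on-chip instead of round-tripping them through DRAM. As a rough software illustration of that dataflow (a minimal sketch, not the authors' hardware design; the function name, shapes, and kernel size are all hypothetical), the following Python snippet fuses a 3x3 convolution with 2x2 max pooling so that pooled outputs are emitted as soon as two convolved rows are ready, buffering only those two rows rather than the full convolution output:

        # Minimal sketch of conv/max-pool fusion (illustrative only, not the
        # paper's RTL): pool each 2x2 strip as soon as its two convolution
        # rows exist, so the full conv output is never written back to "DRAM".
        import numpy as np

        def fused_conv_maxpool(x, w):
            """x: (H, W) input map; w: (3, 3) kernel.
            Valid 3x3 convolution followed by 2x2/stride-2 max pooling, fused."""
            H, W = x.shape
            oh, ow = H - 2, W - 2                       # valid-conv output size
            pooled = np.empty((oh // 2, ow // 2), dtype=x.dtype)
            row_buf = np.empty((2, ow), dtype=x.dtype)  # only 2 conv rows live
            for r in range(oh - oh % 2):
                for c in range(ow):                     # one conv output row
                    row_buf[r % 2, c] = np.sum(x[r:r + 3, c:c + 3] * w)
                if r % 2 == 1:                          # a 2-row strip is done:
                    for pc in range(ow // 2):           # pool it immediately
                        pooled[r // 2, pc] = row_buf[:, 2 * pc:2 * pc + 2].max()
            return pooled

        x = np.arange(64, dtype=np.float32).reshape(8, 8)
        w = np.ones((3, 3), dtype=np.float32)
        print(fused_conv_maxpool(x, w))                 # (3, 3) pooled output

    In hardware, the inner convolution loop corresponds to what a CIM macro would compute in-array, and the two-row buffer stands in for on-chip SRAM; the same principle, applied across consecutive layers rather than within one conv/pool pair, is what the abstract refers to as CIM layer fusion.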