SRAM-based compute-in-memory macros for artificial intelligence applications
Format: Thesis (Doctor of Philosophy)
Language: English
Published: Nanyang Technological University, 2024
Online Access: https://hdl.handle.net/10356/173157
Institution: Nanyang Technological University
Summary: With the rapid growth of artificial intelligence technology, processing data-intensive workloads on traditional von Neumann hardware faces numerous challenges, such as power-hungry computing and unsatisfactory processing latency. For edge devices, especially battery-powered ones, low power consumption is a critical, high-priority requirement. It is well known that the "memory wall" between the computation and storage units forces frequent data transfers, which lead to considerable power consumption and longer processing latency. To break down the "memory wall", compute-in-memory (CIM) has been proposed as an attractive and promising approach in which storage and computation are both accomplished within the bit-cell array. With the CIM approach, energy efficiency improves tremendously because memory accesses are minimized. This thesis explores SRAM-based compute-in-memory macros for artificial intelligence applications to achieve higher energy efficiency and shorter processing delay.
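The CIM principle the summary describes can be illustrated with a small sketch. This is not from the thesis; it is a hypothetical model, in plain Python, of the idea that weights stay resident in the memory array and each column accumulates a multiply-accumulate result locally, so only inputs and outputs cross the memory boundary.

```python
# Illustrative model of compute-in-memory (hypothetical, not the thesis design):
# the weight matrix is "stored" in the array and never moved; each output is
# accumulated where the weights reside, mimicking per-bitline accumulation.

def cim_matvec(weight_array, inputs):
    """Compute y = W.x row by row, as a stand-in for in-array accumulation."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weight_array]

# Example: a 2x2 weight array held in place, one input vector streamed in.
print(cim_matvec([[1, 2], [3, 4]], [1, 1]))  # -> [3, 7]
```

In real SRAM-based macros the per-row products and column sums are produced by the bit cells and peripheral circuits rather than by software loops; the sketch only captures the data-movement argument, not the circuit implementation.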
As the first example, an analog-based transposable CIM macro is proposed to accelerate both the inference and training stages of convolutional neural networks. Compared with previous transposable designs, which accomplish two-way processing through a shared unit, the proposed local-transpose bit-cell achieves bidirectional data propagation and improves the array utilization rate to 100%.
Then, to eliminate the intrinsic limitations of analog-based CIM designs, such as limited ADC precision and PVT variations, a digital-based CIM macro is introduced that achieves 400 MHz full-precision CIM processing. Processing with no accuracy loss and high energy efficiency are achieved within the proposed 64 Kb CIM architecture thanks to its fully digital circuits.
Finally, a fully digital, versatile CIM macro is presented for accelerating various types of machine-learning algorithms. Input and weight precisions are reconfigurable from 1 bit to 16 bits. With weight-stationary addition, operand-stationary addition, and bit-serial multiplication implemented in the CIM macro, both self-organizing maps and convolutional neural networks can be accelerated by the proposed architecture.
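Bit-serial multiplication, named above as one of the macro's operating modes, processes one input bit per cycle and reuses the same adder hardware for any input precision. The following sketch is an assumption about how such a scheme typically works (function name and structure are illustrative, not taken from the thesis):

```python
# Hypothetical sketch of bit-serial multiply-accumulate: inputs are fed one
# bit per cycle (LSB first); each cycle the array sums the weights gated by
# the current input bits, and the partial sum is shifted by the bit's
# significance before accumulation. Precision is set by the cycle count.

def bit_serial_mac(inputs, weights, in_bits=8):
    """Accumulate sum(x * w) over in_bits serial cycles."""
    acc = 0
    for b in range(in_bits):
        # Partial sum of weights whose corresponding input bit b is 1.
        partial = sum(w for x, w in zip(inputs, weights) if (x >> b) & 1)
        acc += partial << b  # weight the partial sum by bit significance
    return acc

# Matches the parallel dot product: 3*2 + 5*4 = 26.
print(bit_serial_mac([3, 5], [2, 4]))  # -> 26
```

This is why such macros can reconfigure input precision simply by changing the number of serial cycles, trading latency for bit width without altering the datapath.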