SRAM-based compute-in-memory macros for artificial intelligence applications

With the boom of artificial intelligence technology, the processing of intensive data on traditional von Neumann hardware faces numerous challenges, such as power-hungry computation and unsatisfactory processing latency. For edge devices, especially battery-powered ones, low power consumption is a critical, high-priority requirement. It is well known that the “memory wall” between the computation and storage units requires frequent data transmission, which leads to considerable power consumption and longer processing latency. To break down the “memory wall”, compute-in-memory (CIM) has been proposed as an attractive and promising approach in which both the storage and computation functions are accomplished in the bit-cell array. With the CIM approach, energy efficiency is improved tremendously because memory accesses are minimized. This thesis explores SRAM-based compute-in-memory macros for artificial intelligence applications to achieve higher energy efficiency and shorter processing delay. As the first example, an analog-based transposable CIM macro is proposed to accelerate both the inference and training stages of convolutional neural networks. Compared with previous transposable designs, which accomplish two-way processing with a shared unit, the proposed local transpose bit-cell achieves bidirectional data propagation and improves the array utilization rate to 100%. Then, aiming to eliminate the intrinsic limitations of analog-based CIM design, such as limited ADC precision and PVT variations, a digital-based CIM macro is introduced to achieve 400 MHz full-precision CIM processing. No-accuracy-loss processing and high energy efficiency are achieved within the proposed 64 Kb CIM architecture thanks to its fully digital circuits. Finally, a fully digital, versatile CIM macro is presented for accelerating various types of machine learning algorithms. Its input and weight precisions are reconfigurable from 1 bit to 16 bits. With weight-stationary addition, operands-stationary addition, and bit-serial multiplication implemented in the CIM macro, self-organizing maps and convolutional neural networks can both be accelerated by the proposed architecture.
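The bit-serial multiplication mentioned in the thesis abstract is a standard digital-CIM technique: a multi-bit multiplication is decomposed into 1-bit AND operations against the stored weights, and the partial sums are shifted and accumulated according to bit significance. A minimal software sketch of that principle (illustrative only; the function name and loop structure are assumptions, not the thesis's actual circuits):

```python
# Illustrative sketch of bit-serial multiply-accumulate, the arithmetic
# style used by digital CIM macros. Input bits are streamed LSB-first;
# each cycle performs a 1-bit multiply (AND) per stored weight, sums the
# column, and shift-accumulates the partial result.

def bit_serial_mac(inputs, weights, input_bits=4):
    """Compute sum(x * w) by streaming the input bits one at a time."""
    assert len(inputs) == len(weights)
    acc = 0
    for b in range(input_bits):
        # 1-bit "multiply": AND the b-th bit of each input with its weight
        partial = sum(((x >> b) & 1) * w for x, w in zip(inputs, weights))
        acc += partial << b  # weight the partial sum by bit significance
    return acc

print(bit_serial_mac([3, 5, 7], [2, 4, 1]))  # 3*2 + 5*4 + 7*1 = 33
```

In hardware, the per-bit AND happens inside the bit-cell array and the column sum in a digital adder tree, so precision is reconfigurable simply by changing how many bit-cycles are streamed.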


Bibliographic Details
Main Author: Zhang, Xin
Other Authors: Kim Tae Hyoung
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University 2024
Subjects:
Online Access:https://hdl.handle.net/10356/173157
Institution: Nanyang Technological University
Language: English
id sg-ntu-dr.10356-173157
record_format dspace
spelling sg-ntu-dr.10356-173157 2024-02-01T09:53:44Z SRAM-based compute-in-memory macros for artificial intelligence applications Zhang, Xin Kim Tae Hyoung School of Electrical and Electronic Engineering THKIM@ntu.edu.sg Engineering::Electrical and electronic engineering::Integrated circuits With the boom of artificial intelligence technology, the processing of intensive data on traditional von Neumann hardware faces numerous challenges, such as power-hungry computation and unsatisfactory processing latency. For edge devices, especially battery-powered ones, low power consumption is a critical, high-priority requirement. It is well known that the “memory wall” between the computation and storage units requires frequent data transmission, which leads to considerable power consumption and longer processing latency. To break down the “memory wall”, compute-in-memory (CIM) has been proposed as an attractive and promising approach in which both the storage and computation functions are accomplished in the bit-cell array. With the CIM approach, energy efficiency is improved tremendously because memory accesses are minimized. This thesis explores SRAM-based compute-in-memory macros for artificial intelligence applications to achieve higher energy efficiency and shorter processing delay. As the first example, an analog-based transposable CIM macro is proposed to accelerate both the inference and training stages of convolutional neural networks. Compared with previous transposable designs, which accomplish two-way processing with a shared unit, the proposed local transpose bit-cell achieves bidirectional data propagation and improves the array utilization rate to 100%. Then, aiming to eliminate the intrinsic limitations of analog-based CIM design, such as limited ADC precision and PVT variations, a digital-based CIM macro is introduced to achieve 400 MHz full-precision CIM processing. No-accuracy-loss processing and high energy efficiency are achieved within the proposed 64 Kb CIM architecture thanks to its fully digital circuits. Finally, a fully digital, versatile CIM macro is presented for accelerating various types of machine learning algorithms. Its input and weight precisions are reconfigurable from 1 bit to 16 bits. With weight-stationary addition, operands-stationary addition, and bit-serial multiplication implemented in the CIM macro, self-organizing maps and convolutional neural networks can both be accelerated by the proposed architecture. Doctor of Philosophy 2024-01-18T12:01:16Z 2024-01-18T12:01:16Z 2023 Thesis-Doctor of Philosophy Zhang, X. (2023). SRAM-based compute-in-memory macros for artificial intelligence applications. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/173157 https://hdl.handle.net/10356/173157 10.32657/10356/173157 en This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). application/pdf Nanyang Technological University
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Engineering::Electrical and electronic engineering::Integrated circuits
spellingShingle Engineering::Electrical and electronic engineering::Integrated circuits
Zhang, Xin
SRAM-based compute-in-memory macros for artificial intelligence applications
description With the boom of artificial intelligence technology, the processing of intensive data on traditional von Neumann hardware faces numerous challenges, such as power-hungry computation and unsatisfactory processing latency. For edge devices, especially battery-powered ones, low power consumption is a critical, high-priority requirement. It is well known that the “memory wall” between the computation and storage units requires frequent data transmission, which leads to considerable power consumption and longer processing latency. To break down the “memory wall”, compute-in-memory (CIM) has been proposed as an attractive and promising approach in which both the storage and computation functions are accomplished in the bit-cell array. With the CIM approach, energy efficiency is improved tremendously because memory accesses are minimized. This thesis explores SRAM-based compute-in-memory macros for artificial intelligence applications to achieve higher energy efficiency and shorter processing delay. As the first example, an analog-based transposable CIM macro is proposed to accelerate both the inference and training stages of convolutional neural networks. Compared with previous transposable designs, which accomplish two-way processing with a shared unit, the proposed local transpose bit-cell achieves bidirectional data propagation and improves the array utilization rate to 100%. Then, aiming to eliminate the intrinsic limitations of analog-based CIM design, such as limited ADC precision and PVT variations, a digital-based CIM macro is introduced to achieve 400 MHz full-precision CIM processing. No-accuracy-loss processing and high energy efficiency are achieved within the proposed 64 Kb CIM architecture thanks to its fully digital circuits. Finally, a fully digital, versatile CIM macro is presented for accelerating various types of machine learning algorithms. Its input and weight precisions are reconfigurable from 1 bit to 16 bits. With weight-stationary addition, operands-stationary addition, and bit-serial multiplication implemented in the CIM macro, self-organizing maps and convolutional neural networks can both be accelerated by the proposed architecture.
author2 Kim Tae Hyoung
author_facet Kim Tae Hyoung
Zhang, Xin
format Thesis-Doctor of Philosophy
author Zhang, Xin
author_sort Zhang, Xin
title SRAM-based compute-in-memory macros for artificial intelligence applications
title_short SRAM-based compute-in-memory macros for artificial intelligence applications
title_full SRAM-based compute-in-memory macros for artificial intelligence applications
title_fullStr SRAM-based compute-in-memory macros for artificial intelligence applications
title_full_unstemmed SRAM-based compute-in-memory macros for artificial intelligence applications
title_sort sram-based compute-in-memory macros for artificial intelligence applications
publisher Nanyang Technological University
publishDate 2024
url https://hdl.handle.net/10356/173157
_version_ 1789968690498764800