Racetrack memory based logic design for in-memory computing

In-memory computing has been demonstrated to be an efficient computing infrastructure in the big data era for many applications such as graph processing and encryption. The area and power overhead of CMOS technology based memory design is growing rapidly because of the increasing data capacity and l...

Bibliographic Details
Main Author:	Luo, Tao
Other Authors:	Douglas Leslie Maskell
Format:	Theses and Dissertations
Language:	English
Published:	2018
Subjects:	DRNTU::Engineering::Computer science and engineering:
Online Access:	http://hdl.handle.net/10356/73359
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-73359
record_format	dspace
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	DRNTU::Engineering::Computer science and engineering:
spellingShingle	DRNTU::Engineering::Computer science and engineering: Luo, Tao Racetrack memory based logic design for in-memory computing
description	In-memory computing has been demonstrated to be an efficient computing infrastructure in the big data era for many applications such as graph processing and encryption. The area and power overhead of CMOS technology based memory design is growing rapidly because data capacity and leakage power increase as the technology node shrinks. Racetrack memory, an emerging memory technology, has therefore been proposed to increase the data capacity and power efficiency of modern memory systems. Because the design requirements of conventional logic differ from those of emerging memory based logic for in-memory computing, the well-developed conventional CMOS technology based logic designs are less relevant to emerging memory based in-memory computing. Therefore, novel logic designs for racetrack memory are required. Traditional logic design on separate chips focuses on high speed, at the cost of large area and power consumption. Implementing efficient logic design for in-memory computing is challenging because of its demanding area and power requirements. Firstly, as the computing logic for in-memory computing is built in memory, the available area budget is limited; otherwise the data density of the memory system would be affected. Secondly, due to the thermal constraint of the memory chip, the available energy budget for computing logic design is limited. Large energy consumption may cause malfunction and even permanent damage to the memory chip because of high temperature. Finally, the adoption of emerging memory technologies makes the logic design more challenging due to their unique characteristics, such as the sequential access mechanism of racetrack memory. This thesis addresses the above challenges in racetrack memory based in-memory logic design as follows. 
First, for general computing operations, we propose racetrack memory based half and full adders. The proposed magnetic full adder is implemented with pre-charged sense amplifiers (PCSA) and magnetic tunnel junctions (MTJ). By reusing parts of the logic design, the magnetic full adder significantly improves the area and energy efficiency compared with a CMOS-based full adder and the state-of-the-art magnetic full adder. Second, based on the proposed magnetic full adder, we propose a pipelined Booth multiplier that exploits the inherent sequential access mechanism of racetrack memory and achieves high area and energy efficiency. To increase the throughput of the proposed Booth multiplier, we further parallelize the generation and addition of its partial products. Unlike the area- and energy-consuming adder array architecture in conventional CMOS technology based designs, the proposed multiplier utilizes a weight-based parallel architecture. To ensure high energy efficiency, we propose an optimization that transforms the energy-demanding write operations into shift operations. With this optimization, the weight-based parallel multiplier achieves high throughput while maintaining high area and energy efficiency. Third, for specific applications, we propose an efficient racetrack memory based design to accelerate modular multiplication. Modular multiplication is widely used in various applications such as cryptography, number theory, group theory, ring theory, knot theory, abstract algebra, computer algebra, computer science, chemistry and the visual and musical arts. To implement modular multiplication efficiently, a novel two-stage scalable modular multiplication algorithm is proposed to significantly reduce the delay. An efficient architecture based on racetrack memory is further developed to reduce the number of required adders. 
The racetrack memory based application-specific design for modular multiplication shows significant improvements in area, energy, and performance compared with the state-of-the-art CMOS technology based implementation. Overall, this thesis contributes to addressing the challenges in racetrack memory based in-memory logic design, and we demonstrate significant improvements in area overhead and energy consumption in comparison with state-of-the-art CMOS technology based logic design.
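The description above mentions a full adder and a Booth multiplier built on it. As a purely illustrative software sketch (not the thesis's PCSA/MTJ circuit, nor its pipelined or weight-based architecture), the following Python models a one-bit full adder, a bit-serial adder that consumes one bit per step (loosely analogous to sequential access), and textbook radix-2 Booth recoding. All function names here are hypothetical and chosen only for this example.

```python
def full_adder(a: int, b: int, carry_in: int) -> tuple[int, int]:
    """One-bit full adder: returns (sum_bit, carry_out)."""
    sum_bit = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return sum_bit, carry_out


def serial_add(x: int, y: int, width: int) -> int:
    """Add two width-bit words one bit per step, LSB first (wraps modulo 2**width)."""
    carry = 0
    result = 0
    for i in range(width):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result


def to_signed(value: int, width: int) -> int:
    """Interpret the low `width` bits of value as a two's complement integer."""
    value &= (1 << width) - 1
    return value - (1 << width) if value >> (width - 1) else value


def booth_multiply(multiplicand: int, multiplier: int, width: int) -> int:
    """Textbook radix-2 Booth multiplication of two signed width-bit integers."""
    out_width = 2 * width
    mask = (1 << out_width) - 1
    m = multiplicand & mask        # multiplicand, sign-extended to 2*width bits
    neg_m = (-multiplicand) & mask  # its two's complement negation
    q = multiplier & ((1 << width) - 1)
    product = 0
    prev_bit = 0                   # implicit bit to the right of the LSB
    for i in range(width):
        bit = (q >> i) & 1
        if (bit, prev_bit) == (1, 0):
            # Start of a run of 1s: subtract multiplicand * 2**i.
            product = serial_add(product, (neg_m << i) & mask, out_width)
        elif (bit, prev_bit) == (0, 1):
            # End of a run of 1s: add multiplicand * 2**i.
            product = serial_add(product, (m << i) & mask, out_width)
        prev_bit = bit
    return to_signed(product, out_width)
```

For example, `booth_multiply(3, -4, 4)` multiplies two 4-bit signed operands and returns an 8-bit signed product. Booth recoding only issues an add or subtract at the boundaries of runs of 1s in the multiplier, which is why it reduces the number of partial products compared with plain shift-and-add.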
author2	Douglas Leslie Maskell
author_facet	Douglas Leslie Maskell Luo, Tao
format	Theses and Dissertations
author	Luo, Tao
author_sort	Luo, Tao
title	Racetrack memory based logic design for in-memory computing
title_short	Racetrack memory based logic design for in-memory computing
title_full	Racetrack memory based logic design for in-memory computing
title_fullStr	Racetrack memory based logic design for in-memory computing
title_full_unstemmed	Racetrack memory based logic design for in-memory computing
title_sort	racetrack memory based logic design for in-memory computing
publishDate	2018
url	http://hdl.handle.net/10356/73359
_version_	1759853746618105856
spelling	sg-ntu-dr.10356-733592023-03-04T00:51:58Z Racetrack memory based logic design for in-memory computing Luo, Tao Douglas Leslie Maskell School of Computer Science and Engineering DRNTU::Engineering::Computer science and engineering: Doctor of Philosophy (SCE) 2018-02-28T02:25:49Z 2018-02-28T02:25:49Z 2018 Thesis Luo, T. (2018). Racetrack memory based logic design for in-memory computing. Doctoral thesis, Nanyang Technological University, Singapore. http://hdl.handle.net/10356/73359 10.32657/10356/73359 en 127 p. application/pdf

Similar Items