Distributed In-Memory Computing on Binary RRAM Crossbar
The recently emerging resistive random-access memory (RRAM) provides not only nonvolatile memory storage but also intrinsic computing for matrix-vector multiplication, which makes it well suited to low-power, high-throughput in-memory data analytics accelerators. However, the existing RRAM crossbar-...
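The matrix-vector multiplication mentioned in the abstract can be illustrated with a minimal sketch, assuming an idealized binary crossbar in which every cell stores 0 or 1 and every input line carries 0 or 1. Each column then accumulates an integer sum, and a bank of 1-bit comparators turns that sum into a thermometer code. The function names `crossbar_mvm` and `comparator_bank` and the half-integer thresholds are illustrative assumptions, not the conversion scheme described in the article.

```python
import numpy as np


def crossbar_mvm(inputs, weights):
    """Ideal binary crossbar: column j accumulates sum_i inputs[i] * weights[i, j].

    Hypothetical model: no wire resistance, noise, or device variation.
    """
    inputs = np.asarray(inputs, dtype=np.int64)
    weights = np.asarray(weights, dtype=np.int64)
    return inputs @ weights


def comparator_bank(column_sums, n_rows):
    """Compare each column sum against thresholds 0.5, 1.5, ..., n_rows - 0.5.

    Each comparison stands for one 1-bit comparator; the row of outputs
    for a column is a thermometer code whose number of ones equals the sum.
    """
    column_sums = np.asarray(column_sums)
    thresholds = np.arange(n_rows) + 0.5
    return (column_sums[:, None] > thresholds[None, :]).astype(np.int64)


# Assumed example data: 4 input lines, 3 output columns.
x = np.array([1, 0, 1, 1])
W = np.array([[1, 0, 1],
              [0, 1, 1],
              [1, 0, 0],
              [1, 1, 1]])

sums = crossbar_mvm(x, W)
code = comparator_bank(sums, n_rows=W.shape[0])
print(sums)
print(code)
```

Counting the ones in each row of `code` recovers the corresponding column sum, so in this idealized model no multilevel analog-to-digital conversion is needed.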
Main Authors: | Ni, Leibin; Huang, Hantao; Liu, Zichuan; Joshi, Rajiv V.; Yu, Hao |
---|---|
Other Authors: | School of Electrical and Electronic Engineering |
Format: | Article |
Language: | English |
Published: | 2017 |
Subjects: | RRAM crossbar; Hardware accelerator |
Online Access: | https://hdl.handle.net/10356/85628 http://hdl.handle.net/10220/43796 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-85628 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-85628
2020-03-07T13:57:28Z
Distributed In-Memory Computing on Binary RRAM Crossbar
Ni, Leibin; Huang, Hantao; Liu, Zichuan; Joshi, Rajiv V.; Yu, Hao
School of Electrical and Electronic Engineering
RRAM crossbar; Hardware accelerator
NRF (Natl Research Foundation, S’pore); MOE (Min. of Education, S’pore)
Accepted version
2017-09-26T08:13:47Z; 2019-12-06T16:07:20Z
2017
Journal Article
Ni, L., Huang, H., Liu, Z., Joshi, R. V., & Yu, H. (2017). Distributed In-Memory Computing on Binary RRAM Crossbar. ACM Journal on Emerging Technologies in Computing Systems, 13(3), 36.
ISSN 1550-4832
https://hdl.handle.net/10356/85628 http://hdl.handle.net/10220/43796
DOI 10.1145/2996192
en
ACM Journal on Emerging Technologies in Computing Systems
© 2017 Association for Computing Machinery (ACM). This is the author-created version of a work that has been peer reviewed and accepted for publication by ACM Journal on Emerging Technologies in Computing Systems, Association for Computing Machinery (ACM). It incorporates referees’ comments, but changes resulting from the publishing process, such as copyediting and structural formatting, may not be reflected in this document. The published version is available at: [http://dx.doi.org/10.1145/2996192].
18 p.
application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
country |
Singapore |
collection |
DR-NTU |
language |
English |
topic |
RRAM crossbar; Hardware accelerator |
description |
The recently emerging resistive random-access memory (RRAM) provides not only nonvolatile memory storage but also intrinsic computing for matrix-vector multiplication, which makes it well suited to low-power, high-throughput in-memory data analytics accelerators. However, existing RRAM crossbar-based computing is mostly assumed to be multilevel analog computing, whose results are sensitive to process nonuniformity and which incurs additional overhead from AD conversion and I/O. In this article, we explore a matrix-vector multiplication accelerator on a binary RRAM crossbar with adaptive 1-bit-comparator-based parallel conversion. A distributed in-memory computing architecture is also developed, together with the corresponding control protocol. Both the memory array and the logic accelerator are implemented on the binary RRAM crossbar, where the logic-memory pairs can be distributed using the control bus protocol. Experimental results show that, compared to the analog RRAM crossbar, the proposed binary RRAM crossbar achieves significant area savings with better calculation accuracy. Moreover, significant speedup is achieved for matrix-vector multiplication in neural network-based machine learning, so that both the overall training time and the testing time are reduced. In addition, large energy savings are achieved compared to the traditional CMOS-based out-of-memory computing architecture. |
author2 |
School of Electrical and Electronic Engineering |
author_facet |
School of Electrical and Electronic Engineering; Ni, Leibin; Huang, Hantao; Liu, Zichuan; Joshi, Rajiv V.; Yu, Hao |
format |
Article |
author |
Ni, Leibin; Huang, Hantao; Liu, Zichuan; Joshi, Rajiv V.; Yu, Hao |
author_sort |
Ni, Leibin |
title |
Distributed In-Memory Computing on Binary RRAM Crossbar |
publishDate |
2017 |
url |
https://hdl.handle.net/10356/85628 http://hdl.handle.net/10220/43796 |