A highly-parallel and energy-efficient 3D multi-layer CMOS-RRAM accelerator for tensorized neural network
It is a grand challenge to develop highly parallel yet energy-efficient machine learning hardware accelerators. This paper introduces a three-dimensional (3-D) multilayer CMOS-RRAM accelerator for a tensorized neural network. Highly parallel matrix-vector multiplication can be performed with low power...
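The abstract's claim that tensorization "can significantly compress the weight matrix of a neural network using much fewer parameters" refers to tensor-train (TT) decomposition of fully-connected layers. As a rough sketch of the idea (the layer sizes, mode factorizations, and TT-rank below are illustrative assumptions, not the configuration used in the paper), one can count the parameters of a TT-factorized weight matrix and confirm the cores recombine to the original matrix shape:

```python
import numpy as np

# Hypothetical 784 -> 500 fully-connected layer (dims chosen for illustration).
# Factorize the modes: 784 = 4*7*4*7, 500 = 5*5*5*4; assume a uniform TT-rank r = 4.
in_modes  = [4, 7, 4, 7]
out_modes = [5, 5, 5, 4]
r = 4
ranks = [1, r, r, r, 1]  # boundary ranks are always 1

rng = np.random.default_rng(0)
# One 4-way core per mode pair: (rank_in, n_k, m_k, rank_out)
cores = [rng.standard_normal((ranks[k], in_modes[k], out_modes[k], ranks[k + 1]))
         for k in range(4)]

tt_params   = sum(c.size for c in cores)                     # parameters stored in TT form
full_params = int(np.prod(in_modes) * np.prod(out_modes))    # parameters of the dense matrix
print(tt_params, full_params, full_params / tt_params)       # large compression ratio

# Reconstruct the dense matrix from the cores to confirm the shapes line up.
W = cores[0]
for c in cores[1:]:
    # contract the trailing rank index with the next core's leading rank index
    W = np.tensordot(W, c, axes=([-1], [0]))
W = W.squeeze()  # drop the boundary ranks of size 1
# interleaved (n1,m1,...,n4,m4) -> grouped (n1..n4, m1..m4) -> flat matrix
W = W.transpose(0, 2, 4, 6, 1, 3, 5, 7).reshape(784, 500)
print(W.shape)
```

In this toy setting the TT form stores 1,072 parameters in place of 392,000, far more aggressive than the 14.85× compression the paper reports; the actual ratio is governed by the TT-ranks, which trade compression against accuracy loss.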
Saved in:
Main Authors: | , , , , |
---|---|
Other Authors: | |
Format: | Article |
Language: | English |
Published: |
2018
|
Subjects: | |
Online Access: | https://hdl.handle.net/10356/87049 http://hdl.handle.net/10220/45222 |
Institution: | Nanyang Technological University |
Language: | English |
id |
sg-ntu-dr.10356-87049 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-87049 2020-03-07T13:56:07Z A highly-parallel and energy-efficient 3D multi-layer CMOS-RRAM accelerator for tensorized neural network Huang, Hantao Ni, Leibin Wang, Kanwen Wang, Yuangang Yu, Hao School of Electrical and Electronic Engineering Tensorized Neural Network (TNN) RRAM Computing It is a grand challenge to develop highly parallel yet energy-efficient machine learning hardware accelerators. This paper introduces a three-dimensional (3-D) multilayer CMOS-RRAM accelerator for a tensorized neural network. Highly parallel matrix-vector multiplication can be performed with low power in the proposed 3-D multilayer CMOS-RRAM accelerator. The adoption of tensorization can significantly compress the weight matrix of a neural network using much fewer parameters. Simulation results using the benchmark MNIST show that the proposed accelerator has 1.283× speed-up, 4.276× energy-saving, and 9.339× area-saving compared to the 3-D CMOS-ASIC implementation; and 6.37× speed-up and 2612× energy-saving compared to 2-D CPU implementation. In addition, 14.85× model compression can be achieved by tensorization with acceptable accuracy loss. NRF (Natl Research Foundation, S’pore) MOE (Min. of Education, S’pore) Accepted version 2018-07-25T04:45:00Z 2019-12-06T16:34:01Z 2018-07-25T04:45:00Z 2019-12-06T16:34:01Z 2018 Journal Article Huang, H., Ni, L., Wang, K., Wang, Y., & Yu, H. (2018). A highly-parallel and energy-efficient 3D multi-layer CMOS-RRAM accelerator for tensorized neural network. IEEE Transactions on Nanotechnology, 17(4), 645-656. 1536-125X https://hdl.handle.net/10356/87049 http://hdl.handle.net/10220/45222 10.1109/TNANO.2017.2732698 en IEEE Transactions on Nanotechnology © 2017 IEEE. Personal use of this material is permitted. 
Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The published version is available at: [http://dx.doi.org/10.1109/TNANO.2017.2732698]. 12 p. application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
country |
Singapore |
collection |
DR-NTU |
language |
English |
topic |
Tensorized Neural Network (TNN) RRAM Computing |
spellingShingle |
Tensorized Neural Network (TNN) RRAM Computing Huang, Hantao Ni, Leibin Wang, Kanwen Wang, Yuangang Yu, Hao A highly-parallel and energy-efficient 3D multi-layer CMOS-RRAM accelerator for tensorized neural network |
description |
It is a grand challenge to develop highly parallel yet energy-efficient machine learning hardware accelerators. This paper introduces a three-dimensional (3-D) multilayer CMOS-RRAM accelerator for a tensorized neural network. Highly parallel matrix-vector multiplication can be performed with low power in the proposed 3-D multilayer CMOS-RRAM accelerator. The adoption of tensorization can significantly compress the weight matrix of a neural network using much fewer parameters. Simulation results using the benchmark MNIST show that the proposed accelerator has 1.283× speed-up, 4.276× energy-saving, and 9.339× area-saving compared to the 3-D CMOS-ASIC implementation; and 6.37× speed-up and 2612× energy-saving compared to 2-D CPU implementation. In addition, 14.85× model compression can be achieved by tensorization with acceptable accuracy loss. |
author2 |
School of Electrical and Electronic Engineering |
author_facet |
School of Electrical and Electronic Engineering Huang, Hantao Ni, Leibin Wang, Kanwen Wang, Yuangang Yu, Hao |
format |
Article |
author |
Huang, Hantao Ni, Leibin Wang, Kanwen Wang, Yuangang Yu, Hao |
author_sort |
Huang, Hantao |
title |
A highly-parallel and energy-efficient 3D multi-layer CMOS-RRAM accelerator for tensorized neural network |
title_short |
A highly-parallel and energy-efficient 3D multi-layer CMOS-RRAM accelerator for tensorized neural network |
title_full |
A highly-parallel and energy-efficient 3D multi-layer CMOS-RRAM accelerator for tensorized neural network |
title_fullStr |
A highly-parallel and energy-efficient 3D multi-layer CMOS-RRAM accelerator for tensorized neural network |
title_full_unstemmed |
A highly-parallel and energy-efficient 3D multi-layer CMOS-RRAM accelerator for tensorized neural network |
title_sort |
highly-parallel and energy-efficient 3d multi-layer cmos-rram accelerator for tensorized neural network |
publishDate |
2018 |
url |
https://hdl.handle.net/10356/87049 http://hdl.handle.net/10220/45222 |
_version_ |
1681042014594924544 |