A highly-parallel and energy-efficient 3D multi-layer CMOS-RRAM accelerator for tensorized neural network

It is a grand challenge to develop highly parallel yet energy-efficient machine learning hardware accelerators. This paper introduces a three-dimensional (3-D) multilayer CMOS-RRAM accelerator for a tensorized neural network. Highly parallel matrix-vector multiplication can be performed with low power...
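The abstract's compression claim rests on tensorization: factorizing a dense weight matrix into a chain of small tensor-train (TT) cores so that far fewer parameters are stored. A minimal sketch of the parameter arithmetic is below; the mode sizes and TT ranks are illustrative assumptions, not values taken from the paper, so the resulting ratio will not match the paper's reported 14.85× compression.

```python
# Illustrative sketch (not the paper's code): parameter count of a
# tensor-train (TT) factorized weight matrix vs. the dense matrix.
# Mode sizes and ranks below are assumptions chosen for illustration.

def tt_param_count(in_modes, out_modes, ranks):
    """Parameters in TT cores G_k of shape (r_{k-1}, m_k, n_k, r_k)."""
    assert len(in_modes) == len(out_modes) == len(ranks) - 1
    return sum(ranks[k] * in_modes[k] * out_modes[k] * ranks[k + 1]
               for k in range(len(in_modes)))

in_modes  = [4, 7, 4, 7]     # factorization of 784 = 4*7*4*7 (MNIST input)
out_modes = [4, 4, 4, 4]     # factorization of 256 hidden units
ranks     = [1, 8, 8, 8, 1]  # boundary ranks are 1 by convention

dense = 784 * 256            # parameters in the dense weight matrix
tt = tt_param_count(in_modes, out_modes, ranks)
print(dense, tt, round(dense / tt, 2))  # → 200704 3168 63.35
```

Lower TT ranks give higher compression at the cost of approximation accuracy, which is the trade-off behind the "acceptable accuracy loss" noted in the abstract.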


Bibliographic Details
Main Authors: Huang, Hantao, Ni, Leibin, Wang, Kanwen, Wang, Yuangang, Yu, Hao
Other Authors: School of Electrical and Electronic Engineering
Format: Article
Language: English
Published: 2018
Subjects:
Online Access:https://hdl.handle.net/10356/87049
http://hdl.handle.net/10220/45222
Institution: Nanyang Technological University
Language: English
id sg-ntu-dr.10356-87049
record_format dspace
spelling sg-ntu-dr.10356-870492020-03-07T13:56:07Z A highly-parallel and energy-efficient 3D multi-layer CMOS-RRAM accelerator for tensorized neural network Huang, Hantao Ni, Leibin Wang, Kanwen Wang, Yuangang Yu, Hao School of Electrical and Electronic Engineering Tensorized Neural Network (TNN) RRAM Computing It is a grand challenge to develop highly parallel yet energy-efficient machine learning hardware accelerators. This paper introduces a three-dimensional (3-D) multilayer CMOS-RRAM accelerator for a tensorized neural network. Highly parallel matrix-vector multiplication can be performed with low power in the proposed 3-D multilayer CMOS-RRAM accelerator. The adoption of tensorization can significantly compress the weight matrix of a neural network using much fewer parameters. Simulation results using the benchmark MNIST show that the proposed accelerator has 1.283× speed-up, 4.276× energy-saving, and 9.339× area-saving compared to the 3-D CMOS-ASIC implementation; and 6.37× speed-up and 2612× energy-saving compared to 2-D CPU implementation. In addition, 14.85× model compression can be achieved by tensorization with acceptable accuracy loss. NRF (Natl Research Foundation, S’pore) MOE (Min. of Education, S’pore) Accepted version 2018-07-25T04:45:00Z 2019-12-06T16:34:01Z 2018-07-25T04:45:00Z 2019-12-06T16:34:01Z 2018 Journal Article Huang, H., Ni, L., Wang, K., Wang, Y., & Yu, H. (2018). A highly-parallel and energy-efficient 3D multi-layer CMOS-RRAM accelerator for tensorized neural network. IEEE Transactions on Nanotechnology, 17(4), 645-656. 1536-125X https://hdl.handle.net/10356/87049 http://hdl.handle.net/10220/45222 10.1109/TNANO.2017.2732698 en IEEE Transactions on Nanotechnology © 2017 IEEE. Personal use of this material is permitted. 
Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The published version is available at: [http://dx.doi.org/10.1109/TNANO.2017.2732698]. 12 p. application/pdf
institution Nanyang Technological University
building NTU Library
country Singapore
collection DR-NTU
language English
topic Tensorized Neural Network (TNN)
RRAM Computing
spellingShingle Tensorized Neural Network (TNN)
RRAM Computing
Huang, Hantao
Ni, Leibin
Wang, Kanwen
Wang, Yuangang
Yu, Hao
A highly-parallel and energy-efficient 3D multi-layer CMOS-RRAM accelerator for tensorized neural network
description It is a grand challenge to develop highly parallel yet energy-efficient machine learning hardware accelerators. This paper introduces a three-dimensional (3-D) multilayer CMOS-RRAM accelerator for a tensorized neural network. Highly parallel matrix-vector multiplication can be performed with low power in the proposed 3-D multilayer CMOS-RRAM accelerator. The adoption of tensorization can significantly compress the weight matrix of a neural network using much fewer parameters. Simulation results using the benchmark MNIST show that the proposed accelerator has 1.283× speed-up, 4.276× energy-saving, and 9.339× area-saving compared to the 3-D CMOS-ASIC implementation; and 6.37× speed-up and 2612× energy-saving compared to 2-D CPU implementation. In addition, 14.85× model compression can be achieved by tensorization with acceptable accuracy loss.
author2 School of Electrical and Electronic Engineering
author_facet School of Electrical and Electronic Engineering
Huang, Hantao
Ni, Leibin
Wang, Kanwen
Wang, Yuangang
Yu, Hao
format Article
author Huang, Hantao
Ni, Leibin
Wang, Kanwen
Wang, Yuangang
Yu, Hao
author_sort Huang, Hantao
title A highly-parallel and energy-efficient 3D multi-layer CMOS-RRAM accelerator for tensorized neural network
title_short A highly-parallel and energy-efficient 3D multi-layer CMOS-RRAM accelerator for tensorized neural network
title_full A highly-parallel and energy-efficient 3D multi-layer CMOS-RRAM accelerator for tensorized neural network
title_fullStr A highly-parallel and energy-efficient 3D multi-layer CMOS-RRAM accelerator for tensorized neural network
title_full_unstemmed A highly-parallel and energy-efficient 3D multi-layer CMOS-RRAM accelerator for tensorized neural network
title_sort highly-parallel and energy-efficient 3d multi-layer cmos-rram accelerator for tensorized neural network
publishDate 2018
url https://hdl.handle.net/10356/87049
http://hdl.handle.net/10220/45222
_version_ 1681042014594924544