A highly-parallel and energy-efficient 3D multi-layer CMOS-RRAM accelerator for tensorized neural network

Bibliographic Details
Main Authors: Huang, Hantao, Ni, Leibin, Wang, Kanwen, Wang, Yuangang, Yu, Hao
Other Authors: School of Electrical and Electronic Engineering
Format: Article
Language:English
Published: 2018
Online Access:https://hdl.handle.net/10356/87049
http://hdl.handle.net/10220/45222
Institution: Nanyang Technological University
Description
Summary: It is a grand challenge to develop highly parallel yet energy-efficient machine learning hardware accelerators. This paper introduces a three-dimensional (3-D) multilayer CMOS-RRAM accelerator for a tensorized neural network. Highly parallel matrix-vector multiplication can be performed with low power in the proposed 3-D multilayer CMOS-RRAM accelerator. The adoption of tensorization can significantly compress the weight matrices of a neural network using far fewer parameters. Simulation results on the MNIST benchmark show that the proposed accelerator achieves 1.283× speed-up, 4.276× energy saving, and 9.339× area saving compared to a 3-D CMOS-ASIC implementation, and 6.37× speed-up and 2612× energy saving compared to a 2-D CPU implementation. In addition, 14.85× model compression can be achieved by tensorization with acceptable accuracy loss.
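
The tensorization described in the abstract can be pictured as a tensor-train style factorization of a fully connected layer's weight matrix. The sketch below is a minimal illustration of that idea, not the authors' implementation: the mode shapes, the maximum rank, and the tt_compress helper are illustrative assumptions, and the reported compression ratio is whatever this toy configuration happens to give.

# Minimal sketch (assumed, not from the paper): tensor-train (TT) compression
# of a weight matrix via successive truncated SVDs, to show how tensorization
# represents a large matrix with far fewer parameters.
import numpy as np

def tt_compress(W, in_modes, out_modes, max_rank):
    """Decompose W (prod(in_modes) x prod(out_modes)) into TT cores."""
    d = len(in_modes)
    # Reshape W into a 2d-way tensor and interleave input/output modes.
    T = W.reshape(list(in_modes) + list(out_modes))
    perm = [i for pair in zip(range(d), range(d, 2 * d)) for i in pair]
    T = T.transpose(perm)  # shape: (m1, n1, m2, n2, ..., md, nd)

    cores = []
    rank_prev = 1
    mat = T.reshape(rank_prev * in_modes[0] * out_modes[0], -1)
    for k in range(d - 1):
        U, s, Vt = np.linalg.svd(mat, full_matrices=False)
        r = min(max_rank, len(s))
        cores.append(U[:, :r].reshape(rank_prev, in_modes[k], out_modes[k], r))
        mat = np.diag(s[:r]) @ Vt[:r, :]
        rank_prev = r
        mat = mat.reshape(rank_prev * in_modes[k + 1] * out_modes[k + 1], -1)
    cores.append(mat.reshape(rank_prev, in_modes[-1], out_modes[-1], 1))
    return cores

# Example: a 784 x 256 layer (e.g. MNIST input to a hidden layer),
# factorized with modes (4*7*4*7) x (4*4*4*4) and TT-rank capped at 4.
W = np.random.randn(784, 256).astype(np.float32)
cores = tt_compress(W, in_modes=(4, 7, 4, 7), out_modes=(4, 4, 4, 4), max_rank=4)
orig = W.size
compressed = sum(c.size for c in cores)
print(f"parameters: {orig} -> {compressed} ({orig / compressed:.1f}x compression)")

In the accelerator, matrix-vector products are then evaluated core by core on the RRAM crossbars rather than against the full dense matrix, which is what enables the parallelism and energy savings quoted above; the rank cap trades reconstruction accuracy against compression.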