Hardware acceleration of neural networks with CMOS and post-CMOS devices

Bibliographic Details
Main Author: Govind Narasimman
Other Authors: Arindam Basu
Format: Theses and Dissertations
Language: English
Published: 2017
Subjects:
Online Access: http://hdl.handle.net/10356/72479
Institution: Nanyang Technological University
Language: English
id sg-ntu-dr.10356-72479
record_format dspace
spelling sg-ntu-dr.10356-724792023-07-04T17:14:06Z Hardware acceleration of neural networks with CMOS and post-CMOS devices Govind Narasimman Arindam Basu School of Electrical and Electronic Engineering DRNTU::Engineering::Electrical and electronic engineering Master of Engineering 2017-08-04T02:44:58Z 2017-08-04T02:44:58Z 2017 Thesis Govind Narasimman. (2017). Hardware acceleration of neural networks with CMOS and post-CMOS devices. Master's thesis, Nanyang Technological University, Singapore.
http://hdl.handle.net/10356/72479 10.32657/10356/72479 en 97 p. application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering::Electrical and electronic engineering
spellingShingle DRNTU::Engineering::Electrical and electronic engineering
Govind Narasimman
Hardware acceleration of neural networks with CMOS and post-CMOS devices
description There is a pressing need for embedded machine learning in portable devices and smart sensors to power the next generation of the Internet of Things (IoT). Implementing neural networks involves a large number of arithmetic and memory operations. Realizing the arithmetic blocks with conventional digital circuits entails a trade-off between calculation accuracy and silicon area. On the other hand, existing analog building blocks for neural networks suffer from process-variation-induced inaccuracy and high power consumption. Moreover, the efficiency of the computation-memory interface is degrading, since memory bandwidth grows far more slowly than computation throughput in CMOS technology. Although recent large-scale neuromorphic circuits use localized random-access memory to reduce memory operations, this local memory does not scale with growing dataset sizes. Here, we explore novel CMOS and post-CMOS circuits to realize ultra-low-power neuromorphic circuits through algorithm-hardware co-design. The proposed solutions also address the widening bandwidth gap between memory operations and computation. First, a deep neural network with two convolution layers and two fully connected layers is chosen and tuned for hardware implementation. The network has few tunable parameters (about 40,000) and trains roughly 40 times faster than typical four-layer deep neural networks. We propose a compact, single-transistor element, called a 'synapse', for realizing the connections inside the neural network. These synapses perform the required computations by exploiting the mismatch inherent in their fabrication. We use a current-mirror array with n input lines and m output lines to perform an n x m multiply-and-accumulate operation. The resulting neuromorphic circuit can emulate a multi-layered artificial vision system. A circuit fabricated in a 0.35 µm CMOS process is characterized, and a behavioral model is simulated for the deep neural network. Here, learning is done offline. Because the inputs to the network may vary with environmental conditions, an adaptive neural network is needed; hence we propose a second solution in which the neuromorphic circuit can adapt its parameters in real time. With the advent of novel nanoscale devices whose physical properties are well matched to neural networks, enabling computation at energies much lower than CMOS, the research also focuses on the use of a post-CMOS spintronic device, the domain-wall magnet, to obtain a low-power spike-timing-dependent plasticity (STDP) synapse for online learning. The spin-mode signals are injected across a small potential (about 50 mV) through a stack of ferromagnetic and non-magnetic layers. We discuss the implementation of a spiking neural network with synapses that can be trained according to the STDP learning rule. A detailed study is carried out with the help of device-circuit co-simulation, and the possible use of this synapse in online, real-time learning spiking neural networks is also illustrated in this thesis.
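To illustrate the mismatch-based multiply-and-accumulate idea in the abstract, here is a minimal behavioral sketch in Python. The thesis characterizes a fabricated 0.35 µm chip and simulates its own behavioral model; the log-normal mismatch spread, array sizes, and current ranges below are illustrative assumptions, not measured values.

```python
import numpy as np

# Behavioral sketch of an n x m current-mirror synapse array.
# Each single-transistor synapse copies an input current with a fixed
# but random gain set by fabrication mismatch; summing the copied
# currents on each output line yields a multiply-and-accumulate.
# The log-normal spread (sigma) is an illustrative assumption.

rng = np.random.default_rng(0)

def make_synapse_array(n_inputs, m_outputs, sigma=0.3):
    """Random, fixed mirror gains 'frozen' at fabrication time."""
    return rng.lognormal(mean=0.0, sigma=sigma, size=(n_inputs, m_outputs))

def mac(input_currents, gains):
    """n x m MAC: each output line sums its mismatch-weighted copies
    of the n input currents."""
    return input_currents @ gains   # shape: (m_outputs,)

# Example: 16 input lines driving 8 output lines.
gains = make_synapse_array(16, 8)
x = rng.uniform(0.0, 1e-6, size=16)   # input currents in amperes
print(mac(x, gains))
```

Since the mirror gains are fixed once the chip is fabricated, such an array behaves like a fixed random linear transform, which is consistent with the offline learning described for the first solution.

The STDP rule mentioned for the domain-wall-magnet synapse can likewise be sketched behaviorally as a pair-based weight update; the learning rates and time constants below are placeholder assumptions, and the spintronic device physics studied in the thesis is not modeled here.

```python
import numpy as np

# Pair-based STDP sketch: potentiate when the presynaptic spike
# precedes the postsynaptic spike, depress otherwise. Rates and time
# constants are placeholder values, not device data from the thesis.

A_PLUS, A_MINUS = 0.01, 0.012      # learning rates (assumed)
TAU_PLUS, TAU_MINUS = 20.0, 20.0   # time constants in ms (assumed)

def stdp_dw(t_pre, t_post):
    """Weight change for one pre/post spike pair (times in ms)."""
    dt = t_post - t_pre
    if dt > 0:    # pre before post -> potentiation
        return A_PLUS * np.exp(-dt / TAU_PLUS)
    elif dt < 0:  # post before pre -> depression
        return -A_MINUS * np.exp(dt / TAU_MINUS)
    return 0.0

# Example: pre spike at 10 ms, post spike at 15 ms -> small potentiation.
w = 0.5
w = np.clip(w + stdp_dw(10.0, 15.0), 0.0, 1.0)
print(w)
```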
author2 Arindam Basu
author_facet Arindam Basu
Govind Narasimman
format Theses and Dissertations
author Govind Narasimman
author_sort Govind Narasimman
title Hardware acceleration of neural networks with CMOS and post-CMOS devices
title_short Hardware acceleration of neural networks with CMOS and post-CMOS devices
title_full Hardware acceleration of neural networks with CMOS and post-CMOS devices
title_fullStr Hardware acceleration of neural networks with CMOS and post-CMOS devices
title_full_unstemmed Hardware acceleration of neural networks with CMOS and post-CMOS devices
title_sort hardware acceleration of neural networks with cmos and post-cmos devices
publishDate 2017
url http://hdl.handle.net/10356/72479
_version_ 1772825893879676928