Compact and fast machine learning accelerator for IoT devices
Main Author: | |
---|---|
Other Authors: | |
Format: | Theses and Dissertations |
Language: | English |
Published: | 2018 |
Subjects: | |
Online Access: | http://hdl.handle.net/10356/73823 |
Institution: | Nanyang Technological University |
Summary: | The Internet of Things (IoT) is the networked interconnection of every object to provide intelligent services and improve economic benefit. The potential of IoT and of ubiquitous computing is staggering but is limited by many technical challenges. One challenge is responding in real time to dynamic changes in the environment. A machine learning accelerator on IoT edge devices is one potential solution, since a centralized system suffers from long back-end processing latency. However, IoT edge devices are resource-constrained and machine learning algorithms are computationally intensive. Therefore, optimized machine learning algorithms, such as compact machine learning for lower memory usage on IoT devices, are greatly needed. In this thesis, we explore the development of fast and compact machine learning accelerators by developing a least-squares solver, a tensor-solver and a distributed-solver. Moreover, applications such as an energy management system using these machine learning solvers on IoT devices are also investigated. From the fast machine learning perspective, the target is to perform fast learning on the neural network. This thesis proposes a least-squares solver for single-hidden-layer neural networks. Furthermore, this thesis explores a CMOS FPGA-based hardware accelerator and an RRAM-based hardware accelerator. From the compact machine learning perspective, this thesis proposes a tensor-solver for deep neural network compression that takes accuracy into account. A layer-wise training method for tensorized neural networks (TNN) is proposed to formulate a multilayer neural network such that the weight matrix can be significantly compressed during training. From the large-scale IoT network perspective, this thesis proposes a distributed-solver on IoT devices. Furthermore, this thesis proposes a distributed neural network and sequential learning on smart gateways for indoor positioning, energy management and IoT network security in IoT systems. |