Implementation of a convolutional neural network on FPGA

Bibliographic Details
Main Author: Xue, Can
Other Authors: Goh Wang Ling
Format: Theses and Dissertations
Language: English
Published: 2018
Subjects:
Online Access: http://hdl.handle.net/10356/73322
Institution: Nanyang Technological University
Description
Summary: Neural network computing has attracted a lot of attention because it borrows concepts from the human brain to achieve artificial intelligence (AI). AI has been used in many applications, such as energy estimation and battery management in smart grids, and image and audio recognition in intelligent robots. The limitations of software-based neural network computing include low speed and high power consumption, which make it difficult to use in real-time and portable applications, especially when the required accuracy is high. To address this issue, hardware-based neural network computation has been investigated. The challenge, however, is to map different neural network topologies efficiently onto a hardware accelerator platform with a fixed processing engine array, so that latency and energy consumption are minimized. Convolutional neural networks (CNNs) are highly complex. In this dissertation, the optimal mapping of different neural network topologies to an FPGA (Field-Programmable Gate Array) is investigated. The main work is to study the neural network topologies, the FPGA-based processing engine array, and a methodology for mapping a specific network topology to the platform. This includes the mapping of the computation, data transfer, and control parts. The main optimization objectives are latency and energy efficiency. Some research on training is also introduced; training can be carried out in software using MATLAB and includes both supervised and unsupervised learning. Another focus is the implementation of the processing engine architecture. Several architectures are proposed, and one preferred architecture was implemented on the hardware platform. In this project, a single-layer structure is achieved, and the simulation results obtained with Vivado are analysed.
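
As a rough illustration of what mapping one convolutional layer onto a fixed processing engine array involves, the sketch below computes a single-channel convolution in plain Python and estimates how many cycles it would take on an array of processing engines. It is not taken from the dissertation: the function names, the mapping of one output pixel to one processing engine, and the assumption of one multiply-accumulate per engine per cycle are all hypothetical simplifications, and data transfer and control overhead are ignored.

```python
import math


def conv2d_single_layer(image, kernel):
    """Direct 2D convolution as used in CNNs (no kernel flip),
    with valid padding and stride 1, on one input channel."""
    h, w = len(image), len(image[0])
    k = len(kernel)  # assumes a square k x k kernel
    out_h, out_w = h - k + 1, w - k + 1
    out = [[0] * out_w for _ in range(out_h)]
    for r in range(out_h):
        for c in range(out_w):
            acc = 0
            # One output pixel needs k * k multiply-accumulate operations.
            for i in range(k):
                for j in range(k):
                    acc += image[r + i][c + j] * kernel[i][j]
            out[r][c] = acc
    return out


def estimate_cycles(out_h, out_w, k, num_pes):
    """Hypothetical latency model: each processing engine computes one
    output pixel at a time, at one multiply-accumulate per cycle."""
    macs_per_output = k * k
    # Output pixels are processed in batches of at most num_pes.
    passes = math.ceil(out_h * out_w / num_pes)
    return passes * macs_per_output


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, 1]]

result = conv2d_single_layer(image, kernel)
print(result)
print(estimate_cycles(len(result), len(result[0]), len(kernel), num_pes=3))
```

Under this simplified model, the cycle count depends on how evenly the output pixels divide across the array: four output pixels on three engines need two passes, so one pass leaves two engines idle. Reducing that kind of under-utilization is one aspect of the mapping problem the summary describes.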