Implementing machine learning algorithms on FPGA for edge computing

Bibliographic Details
Main Author: Chen, Zhuoran
Other Authors: Lam Siew Kei
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2021
Online Access:https://hdl.handle.net/10356/148052
Institution: Nanyang Technological University
Description
Summary: In recent years, with the development of high-performance computing devices, the convolutional neural network (CNN) has become one of the most popular machine learning algorithms and has achieved unprecedented success across many application domains. However, despite their strong performance, traditional graphics processing unit (GPU) based implementations of CNNs suffer from high power consumption and limited deployment flexibility. Field-programmable gate arrays (FPGAs) are a good alternative for CNN implementation. In this project, the well-known LeNet-5 model is trained on GPUs and implemented on a Xilinx FPGA platform for the inference task. Different techniques are explored to reduce resource utilization and improve the timing performance of the design. We adopt post-training quantization on the model and evaluate the results of different quantization bit-width combinations. We also propose an iterative algorithm to determine the optimal solution for the trade-off between model accuracy and hardware performance. Using the proposed algorithm, the quantized model achieves an accuracy of 97.88% with very low hardware utilization, and its maximum clock frequency on a Xilinx Virtex-7 device is 67.84 MHz.
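
The record does not describe the quantization scheme or the iterative search in detail. Purely as an illustration of the kind of step involved, the Python sketch below applies a common form of post-training quantization (uniform symmetric fixed-point rounding of a weight tensor) and compares several bit widths. The function names, tensor shape, and bit-width choices here are assumptions for illustration and are not taken from the project itself.

import numpy as np

def quantize_uniform(weights, num_bits):
    # Uniform symmetric post-training quantization of a weight tensor
    # to a signed fixed-point grid with `num_bits` bits (illustrative only).
    qmax = 2 ** (num_bits - 1) - 1
    max_abs = np.max(np.abs(weights))
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = np.clip(np.round(weights / scale), -qmax - 1, qmax)
    return q * scale  # dequantized values, used to gauge the accuracy impact

# Example: compare bit widths for a conv-layer-shaped weight tensor
# (6 filters of 5x5 on 1 input channel, similar to LeNet-5's first layer).
rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.1, size=(6, 1, 5, 5))
for bits in (4, 6, 8):
    err = np.mean(np.abs(w - quantize_uniform(w, bits)))
    print(f"{bits}-bit quantization, mean abs error: {err:.5f}")

In an accuracy/hardware trade-off study such as the one described above, the quantization error (or, more directly, the resulting test accuracy) for each bit-width combination would be weighed against the corresponding FPGA resource and timing figures.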