Implementing machine learning algorithms on FPGA for edge computing

In recent years, with the development of high-performance computing devices, the convolutional neural network (CNN) has become one of the most popular machine learning algorithms, achieving unprecedented success across a wide range of application fields. However, despite its strong performance, traditional graphics processing unit (GPU) based implementations of CNNs suffer from high power consumption and limited deployment flexibility. Field-programmable gate arrays (FPGAs) are a good alternative for CNN implementations. In this project, the well-known LeNet-5 model is trained on GPUs and implemented on a Xilinx FPGA platform for the inference task. Different techniques are explored to reduce resource utilization and improve the timing performance of the design. We apply post-training quantization to the model and evaluate the results of different quantization bit-width combinations. We also propose an iterative algorithm to determine the optimal trade-off between model accuracy and hardware performance. Using the proposed algorithm, the quantized model achieves an accuracy of 97.88% with very low hardware utilization, and its maximum clock frequency on a Xilinx Virtex-7 device is 67.84 MHz.
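To make the quantization step concrete, the sketch below shows one way symmetric fixed-point post-training quantization with a configurable bit width could be expressed, together with a simple iterative search over bit-width combinations against an accuracy floor. This is an illustrative assumption only: the function names, the fixed-point scheme, and the search loop are not taken from the report, whose actual algorithm may differ.

```python
# Illustrative sketch only: symmetric fixed-point post-training quantization
# and a simple bit-width search. Names and the search strategy are assumptions
# for illustration, not the report's actual implementation.
import numpy as np


def quantize_fixed_point(weights: np.ndarray, total_bits: int, frac_bits: int) -> np.ndarray:
    """Round a float tensor onto a signed fixed-point grid with `total_bits`
    bits overall, of which `frac_bits` are fractional."""
    scale = 2.0 ** frac_bits
    qmin = -(2 ** (total_bits - 1))        # most negative representable code
    qmax = 2 ** (total_bits - 1) - 1       # most positive representable code
    codes = np.clip(np.round(weights * scale), qmin, qmax)
    return codes / scale                   # de-quantized values used at inference time


def search_bit_widths(weights, evaluate_accuracy, candidates, min_accuracy):
    """Walk bit-width combinations from widest to narrowest and keep the
    narrowest one whose quantized weights still meet the accuracy target."""
    best = None
    for total_bits, frac_bits in sorted(candidates, reverse=True):
        acc = evaluate_accuracy(quantize_fixed_point(weights, total_bits, frac_bits))
        if acc < min_accuracy:
            break                          # accuracy floor violated; stop narrowing
        best = (total_bits, frac_bits, acc)
    return best


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    weights = rng.normal(scale=0.5, size=(120, 84))   # stand-in for one LeNet-5 layer

    def evaluate_accuracy(quantized):
        # Toy proxy: a real evaluation would run MNIST inference with the quantized model.
        return 1.0 - float(np.mean(np.abs(quantized - weights)))

    print(search_bit_widths(weights, evaluate_accuracy,
                            [(16, 12), (12, 8), (8, 6), (6, 4)], 0.97))
```

The rationale for preferring the narrowest acceptable format is a hardware one: on an FPGA, smaller operand widths translate directly into smaller multipliers, shallower adder trees, and lower resource utilization, which is what allows accuracy to be traded against timing and area.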

Bibliographic Details
Main Author: Chen, Zhuoran
Other Authors: Lam Siew Kei (School of Computer Science and Engineering, ASSKLam@ntu.edu.sg)
Format: Final Year Project
Language: English
Published: Nanyang Technological University, 2021
Subjects: Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
Engineering::Computer science and engineering::Hardware::Register-transfer-level implementation
Degree: Bachelor of Engineering (Computer Engineering)
Citation: Chen, Z. (2021). Implementing machine learning algorithms on FPGA for edge computing. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/148052
Online Access:https://hdl.handle.net/10356/148052
Institution: Nanyang Technological University