Implementing machine learning algorithms on FPGA for edge computing
In recent years, with the development of high-performance computing devices, the convolutional neural network (CNN) has become one of the most popular machine learning algorithms and has achieved unprecedented success in a wide range of applications. However, despite its strong performance, the traditional graphics processing unit (GPU) based implementation of CNNs suffers from high power consumption and low deployment flexibility. The field-programmable gate array (FPGA) is a good alternative for CNN implementations. In this project, the well-known LeNet-5 model is trained on GPUs and implemented on a Xilinx FPGA platform for the inference task. Different techniques are explored to reduce resource utilization and improve the timing performance of the design. We apply post-training quantization to the model and evaluate the results of different quantization bit-width combinations. We also propose an iterative algorithm to determine the optimal solution for the trade-off between model accuracy and hardware performance. Using the proposed algorithm, the quantized model achieves an accuracy of 97.88% with very low hardware utilization, and its maximum clock frequency on a Xilinx Virtex-7 device is 67.84 MHz.
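The record does not include the project's source code. As a rough, hypothetical illustration of the post-training quantization step described in the abstract, the sketch below quantizes a trained weight tensor to a chosen bit width using symmetric fixed-point scaling; the NumPy-based approach and function names are assumptions for illustration, not the author's implementation.

```python
import numpy as np

def quantize_tensor(w: np.ndarray, bit_width: int):
    """Symmetric post-training quantization of a weight tensor.

    Maps float weights onto signed integers of the given bit width and
    returns the integer codes plus the scale needed to dequantize them
    (hypothetical helper, not taken from the report).
    """
    qmax = 2 ** (bit_width - 1) - 1                  # e.g. 127 for 8 bits
    max_abs = float(np.max(np.abs(w)))
    scale = max_abs / qmax if max_abs > 0 else 1.0   # avoid div-by-zero for all-zero tensors
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int32)
    return q, scale

def dequantize_tensor(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights for accuracy evaluation."""
    return q.astype(np.float32) * scale

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    weights = rng.normal(scale=0.1, size=(5, 5)).astype(np.float32)
    for bits in (8, 6, 4):                           # candidate bit widths to compare
        q, s = quantize_tensor(weights, bits)
        err = np.abs(weights - dequantize_tensor(q, s)).max()
        print(f"{bits}-bit: max abs quantization error = {err:.5f}")
```

Sweeping the candidate bit widths, as in the loop above, mirrors in spirit the abstract's iterative search over quantization bit-width combinations, where smaller widths reduce FPGA resource usage at the cost of larger quantization error.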
Saved in:
Main Author: Chen, Zhuoran
Other Authors: Lam Siew Kei (School of Computer Science and Engineering)
Format: Final Year Project (FYP)
Language: English
Published: Nanyang Technological University, 2021
Subjects: Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence; Engineering::Computer science and engineering::Hardware::Register-transfer-level implementation
Online Access: https://hdl.handle.net/10356/148052
Institution: Nanyang Technological University
Record ID: sg-ntu-dr.10356-148052
Collection: DR-NTU (NTU Library, Singapore)
Degree: Bachelor of Engineering (Computer Engineering)
Deposited: 2021-04-22
Citation: Chen, Z. (2021). Implementing machine learning algorithms on FPGA for edge computing. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/148052