Exploration of tiny neural networks for IoT applications

Bibliographic Details
Main Author: Murtadla, Bhara Sina
Other Authors: Kim Tae Hyoung
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2023
Online Access: https://hdl.handle.net/10356/167806
Institution: Nanyang Technological University
Description
Summary: This project uses Verilog to design and implement two neural networks on an FPGA: a neural network with region-of-interest (ROI) and a Convolutional Neural Network (CNN). Several performance aspects, including speed, accuracy, and expected power consumption, are analyzed and compared. The project explores multiple neural networks for hand recognition and implements them on an FPGA with several optimizations and adaptations. The networks are optimized with Multiple Instruction Multiple Data (MIMD) parallelism to reduce the processing time of each network. The FPGA-implemented neural network with ROI achieved 99.33% accuracy, the same as the software model. Several image processing methods are introduced into this model, including skin segmentation, morphological image filtering, and a search algorithm, breadth-first search. Hardware inference of the neural network with ROI took 12.5 ms on a 10-MHz clock. The proposed CNN model achieved an accuracy of 94.915%, which was maintained in the hardware implementation. An 8-bit quantization configuration from FBGEMM (Facebook General Matrix Multiplication) is applied to the CNN parameters before the CNN is implemented on the FPGA. Reducing the parameter and calculation bit width from 32-bit to 8-bit cuts the memory consumption of this hardware model by a factor of four and lowers the expected power consumption. Hardware inference of the CNN took 6.5 ms on a 10-MHz clock. Hence, this project shows the advantages, drawbacks, and potential of the hardware implementation of both neural networks.
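
The fourfold memory reduction mentioned in the summary follows from storing each parameter in 8 bits instead of 32. The record does not describe the exact FBGEMM configuration used, so the following is only a minimal sketch of generic per-tensor affine quantization to unsigned 8-bit values. The function names (`quantize_affine`, `dequantize_affine`) and the sample weights are hypothetical and are not taken from the project.

```python
import struct

def quantize_affine(weights, num_bits=8):
    """Map floats to unsigned integers in [0, 2**num_bits - 1].

    Generic affine scheme (assumption): w ~= scale * (q - zero_point).
    """
    qmin, qmax = 0, (1 << num_bits) - 1
    # Extend the range to include 0.0 so that zero is exactly representable.
    lo = min(min(weights), 0.0)
    hi = max(max(weights), 0.0)
    # Fall back to a scale of 1.0 if all weights are zero.
    scale = (hi - lo) / (qmax - qmin) or 1.0
    zero_point = int(round(qmin - lo / scale))
    zero_point = max(qmin, min(qmax, zero_point))
    # Round to the nearest step, shift by the zero point, clamp to range.
    q = [max(qmin, min(qmax, int(round(w / scale)) + zero_point))
         for w in weights]
    return q, scale, zero_point

def dequantize_affine(q, scale, zero_point):
    """Recover approximate float values from the quantized integers."""
    return [scale * (v - zero_point) for v in q]

# Hypothetical sample weights, for illustration only.
weights = [-0.51, -0.12, 0.0, 0.07, 0.33, 0.76]
q, scale, zp = quantize_affine(weights)
restored = dequantize_affine(q, scale, zp)

# Storage comparison: 4 bytes per float32 value versus 1 byte per uint8 value.
float32_bytes = len(struct.pack(f"{len(weights)}f", *weights))
uint8_bytes = len(bytes(q))
memory_ratio = float32_bytes / uint8_bytes
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print(f"quantized: {q}, scale={scale:.6f}, zero_point={zp}")
print(f"memory ratio: {memory_ratio:.1f}x, max error: {max_error:.6f}")
```

In this sketch the storage ratio comes purely from the bit width, and the reconstruction error of each in-range value is bounded by half a quantization step. Whether a given model keeps its accuracy after quantization, as the summary reports for the proposed CNN, has to be measured on that model.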