Ultra-low power real-time object detection based on quantized CNNs
With the recent proliferation of deep learning-based solutions to object detection, the state-of-the-art accuracy has been increasing far beyond what was achievable using traditional methods. However, the hardware requirements for running these models in real-time are high, so they are expensive to...
Saved in:

| Main Author: | Chew, Jing Wei |
|---|---|
| Other Authors: | Weichen Liu |
| Format: | Final Year Project |
| Language: | English |
| Published: | Nanyang Technological University, 2021 |
| Subjects: | Engineering::Computer science and engineering |
| Online Access: | https://hdl.handle.net/10356/148048 |
| Institution: | Nanyang Technological University |
| Language: | English |
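The abstract describes applying the most extreme form of network quantization, binarization, to make a YOLO-based detector deployable on the edge. As a rough illustration of what weight binarization involves, the sketch below shows one common formulation of a binarized convolution in PyTorch: weights are mapped to {-1, +1} with a per-output-channel scaling factor, and gradients flow through the sign function via a straight-through estimator. This is a minimal, generic sketch and is not taken from the QuickYOLO code base; the names `BinarizeWeightSTE` and `BinaryConv2d` are hypothetical.

```python
# Minimal, generic sketch of weight binarization (not code from QuickYOLO).
import torch
import torch.nn as nn
import torch.nn.functional as F


class BinarizeWeightSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, w):
        ctx.save_for_backward(w)
        # Per-output-channel scale: mean absolute value of the real-valued weights.
        alpha = w.abs().mean(dim=(1, 2, 3), keepdim=True)
        # Map weights to {-1, +1} (sign(0) == 0 is ignored in this sketch).
        return torch.sign(w) * alpha

    @staticmethod
    def backward(ctx, grad_out):
        (w,) = ctx.saved_tensors
        # Straight-through estimator: pass gradients only where |w| <= 1.
        return grad_out * (w.abs() <= 1).to(grad_out.dtype)


class BinaryConv2d(nn.Conv2d):
    """Conv2d whose weights are binarized on every forward pass."""

    def forward(self, x):
        w_bin = BinarizeWeightSTE.apply(self.weight)
        return F.conv2d(x, w_bin, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)


if __name__ == "__main__":
    layer = BinaryConv2d(3, 16, kernel_size=3, padding=1)
    y = layer(torch.randn(1, 3, 32, 32))
    print(y.shape)  # torch.Size([1, 16, 32, 32])
```

In a full binary network, activations would typically be binarized as well so that convolutions reduce to XNOR and popcount operations, which is where the speed and memory savings reported in the abstract come from.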
| id | sg-ntu-dr.10356-148048 |
|---|---|
| record_format | dspace |
| spelling | sg-ntu-dr.10356-148048 2021-04-22T07:01:41Z Ultra-low power real-time object detection based on quantized CNNs Chew, Jing Wei Weichen Liu School of Computer Science and Engineering liu@ntu.edu.sg Engineering::Computer science and engineering With the recent proliferation of deep learning-based solutions to object detection, the state-of-the-art accuracy has been increasing far beyond what was achievable using traditional methods. However, the hardware requirements for running these models in real-time are high, so they are expensive to deploy on the edge. Furthermore, due to their large model size, their memory footprint is unnecessarily high, and this also leads to excessive power consumption which makes them unfeasible for deployment on resource-constrained environments with no constant power source. Therefore, this project proposes the use of the most extreme network quantization possible, i.e. binarization, to make a YOLO-based object detection model deployable on the edge, while attaining reasonable accuracy. Using this approach, the proposed model can run at 37.7 FPS on an NVIDIA Jetson Nano with a peak memory footprint of 17.1 MB, while attaining a reasonable mAP@0.50 Intersection over Union (IoU) of 0.37 on the Pascal Visual Object Classes (VOC) dataset. Furthermore, these figures signify a speedup of 21.8x and a memory usage reduction by a factor of 15.3x compared to a similar YOLOv2 full-precision model architecture. Since computation was completely performed on the CPU, the use of TensorRT delegates or any other embedded hardware accelerator can allow for larger models with higher accuracies to be deployed in future works. The full project is open-sourced and can be found in https://github.com/tehtea/QuickYOLO. Bachelor of Engineering (Computer Science) 2021-04-22T07:01:41Z 2021-04-22T07:01:41Z 2021 Final Year Project (FYP) Chew, J. W. (2021). Ultra-low power real-time object detection based on quantized CNNs. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/148048 https://hdl.handle.net/10356/148048 en application/pdf Nanyang Technological University |
| institution | Nanyang Technological University |
| building | NTU Library |
| continent | Asia |
| country | Singapore |
| content_provider | NTU Library |
| collection | DR-NTU |
| language | English |
| topic | Engineering::Computer science and engineering |
| spellingShingle | Engineering::Computer science and engineering Chew, Jing Wei Ultra-low power real-time object detection based on quantized CNNs |
| description | With the recent proliferation of deep learning-based solutions to object detection, state-of-the-art accuracy has increased far beyond what was achievable using traditional methods. However, the hardware requirements for running these models in real time are high, so they are expensive to deploy on the edge. Furthermore, due to their large model size, their memory footprint is unnecessarily high, and this also leads to excessive power consumption, which makes them infeasible for deployment in resource-constrained environments with no constant power source. Therefore, this project proposes the most extreme form of network quantization, i.e., binarization, to make a YOLO-based object detection model deployable on the edge while attaining reasonable accuracy. Using this approach, the proposed model runs at 37.7 FPS on an NVIDIA Jetson Nano with a peak memory footprint of 17.1 MB, while attaining a reasonable mean Average Precision (mAP) of 0.37 at an Intersection over Union (IoU) threshold of 0.50 on the Pascal Visual Object Classes (VOC) dataset. These figures represent a 21.8x speedup and a 15.3x reduction in memory usage compared to a similar full-precision YOLOv2 model architecture. Since computation was performed entirely on the CPU, using TensorRT delegates or other embedded hardware accelerators could allow larger, more accurate models to be deployed in future work. The full project is open-sourced and can be found at https://github.com/tehtea/QuickYOLO. |
| author2 | Weichen Liu |
| author_facet | Weichen Liu Chew, Jing Wei |
| format | Final Year Project |
| author | Chew, Jing Wei |
| author_sort | Chew, Jing Wei |
| title | Ultra-low power real-time object detection based on quantized CNNs |
| title_short | Ultra-low power real-time object detection based on quantized CNNs |
| title_full | Ultra-low power real-time object detection based on quantized CNNs |
| title_fullStr | Ultra-low power real-time object detection based on quantized CNNs |
| title_full_unstemmed | Ultra-low power real-time object detection based on quantized CNNs |
| title_sort | ultra-low power real-time object detection based on quantized cnns |
| publisher | Nanyang Technological University |
| publishDate | 2021 |
| url | https://hdl.handle.net/10356/148048 |
| _version_ | 1698713740244942848 |
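The accuracy reported in this record is mAP at an IoU threshold of 0.50 on Pascal VOC. As a brief, generic illustration of that matching criterion (this is not the project's evaluation code), the sketch below computes the IoU of two axis-aligned boxes and checks whether a detection would count as a true positive at the 0.50 threshold.

```python
# Generic illustration of the IoU >= 0.50 matching rule behind mAP@0.50
# (not the project's evaluation code). Boxes are (x1, y1, x2, y2) corners.

def iou(box_a, box_b):
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Intersection rectangle (area is zero if the boxes do not overlap).
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    prediction = (50, 50, 150, 150)
    ground_truth = (60, 60, 160, 160)
    score = iou(prediction, ground_truth)
    # At mAP@0.50, this detection counts as a true positive only if IoU >= 0.50.
    print(f"IoU = {score:.3f}, true positive at 0.50: {score >= 0.5}")
```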