Ultra-low power real-time object detection based on quantized CNNs

Bibliographic Details
Main Author: Chew, Jing Wei
Other Authors: Weichen Liu
Format: Final Year Project
Language: English
Published: Nanyang Technological University, 2021
Subjects:
Online Access: https://hdl.handle.net/10356/148048
Institution: Nanyang Technological University
Physical Description
Summary: With the recent proliferation of deep learning-based solutions to object detection, state-of-the-art accuracy has increased far beyond what was achievable with traditional methods. However, the hardware requirements for running these models in real time are high, making them expensive to deploy on the edge. Furthermore, their large model size leads to an unnecessarily high memory footprint and to excessive power consumption, which makes them infeasible to deploy in resource-constrained environments with no constant power source. Therefore, this project proposes the most extreme form of network quantization, i.e., binarization, to make a YOLO-based object detection model deployable on the edge while attaining reasonable accuracy. Using this approach, the proposed model runs at 37.7 FPS on an NVIDIA Jetson Nano with a peak memory footprint of 17.1 MB, while attaining a reasonable mean average precision (mAP) of 0.37 at an Intersection over Union (IoU) threshold of 0.50 on the Pascal Visual Object Classes (VOC) dataset. These figures represent a 21.8x speedup and a 15.3x reduction in memory usage compared to a similar full-precision YOLOv2 model architecture. Since computation was performed entirely on the CPU, the use of TensorRT delegates or other embedded hardware accelerators could allow larger, more accurate models to be deployed in future work. The full project is open-sourced and can be found at https://github.com/tehtea/QuickYOLO.
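
To make the summary's central technique concrete, the sketch below shows, in PyTorch, the general recipe for binarizing a convolution layer: weights are mapped to {-1, +1} with a sign function in the forward pass, while a straight-through estimator (STE) lets gradients flow back to latent full-precision weights during training. This is a minimal illustrative sketch only, not the project's method; the names BinarizeSTE and BinaryConv2d are hypothetical, and the actual implementation is in the QuickYOLO repository linked above.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class BinarizeSTE(torch.autograd.Function):
        """Sign binarization with a straight-through estimator (STE) gradient."""

        @staticmethod
        def forward(ctx, x):
            ctx.save_for_backward(x)
            # Note: torch.sign maps 0 -> 0; a full BNN implementation would
            # fold this edge case, but it is fine for a sketch.
            return torch.sign(x)

        @staticmethod
        def backward(ctx, grad_output):
            (x,) = ctx.saved_tensors
            # STE: pass gradients through only where |x| <= 1, so the latent
            # full-precision weights keep receiving updates during training.
            return grad_output * (x.abs() <= 1).to(grad_output.dtype)

    class BinaryConv2d(nn.Conv2d):
        """Conv layer whose weights are binarized to {-1, +1} in the forward pass."""

        def forward(self, x):
            binary_weight = BinarizeSTE.apply(self.weight)
            return F.conv2d(x, binary_weight, self.bias, self.stride,
                            self.padding, self.dilation, self.groups)

    # Example: a binarized 3x3 convolution as might appear in a YOLO backbone.
    layer = BinaryConv2d(in_channels=16, out_channels=32, kernel_size=3, padding=1)
    out = layer(torch.randn(1, 16, 64, 64))
    print(out.shape)  # torch.Size([1, 32, 64, 64])

At inference time, binarized weights and activations allow multiply-accumulate operations to be replaced with XNOR and popcount instructions and each weight to be stored in a single bit, which is broadly where the speedups and memory savings of binary networks, like those reported in the summary, come from.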