Computer vision optimization on embedded GPU board

Bibliographic Details
Main Author: Li, Ziyang
Other Authors: Vun Chan Hua, Nicholas
Format: Final Year Project
Language: English
Published: Nanyang Technological University, 2022
Subjects:
Online Access: https://hdl.handle.net/10356/156654
Institution: Nanyang Technological University
Physical Description
Summary: Computer vision tasks such as image classification are in widespread use and have been greatly advanced by the development of deep learning techniques, in particular convolutional neural networks (CNNs). Performing such tasks on specialized embedded GPU boards offers intriguing prospects for edge computing. In this study, popular CNN architectures including GoogLeNet, ResNet and VGG were implemented on the new Jetson Xavier NX Developer Kit. The models were implemented using different deep learning frameworks including PyTorch, TensorFlow and Caffe, the latter in combination with TensorRT, Nvidia's optimization tool for inference models. The implementations were evaluated and compared on metrics including inference timing and resource utilization. The study concludes that DL-based computer vision tasks remain compute-bound even on more powerful GPU devices, and that the choice of framework has a significant effect on inference performance. In particular, TensorRT yields a very significant improvement in inference timing and scales well across model architectures and model depths.
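
The thesis itself is not reproduced in this record. As an illustration of the kind of measurement the summary describes, the minimal sketch below times GPU inference latency for a pretrained ResNet-50 in PyTorch. The model choice, batch size, and iteration counts are assumptions for illustration only, not the project's actual benchmark code.

```python
import time

import torch
import torchvision.models as models

# Assumed setup: ResNet-50 stands in for one of the evaluated architectures.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = models.resnet50(pretrained=True).to(device).eval()

# Dummy input standing in for a preprocessed 224x224 RGB image batch.
x = torch.randn(1, 3, 224, 224, device=device)

# Warm-up passes so kernel compilation and memory allocation
# do not distort the measurement.
with torch.no_grad():
    for _ in range(10):
        model(x)

# Timed passes; synchronize so queued GPU work is included in the wall clock.
if device.type == "cuda":
    torch.cuda.synchronize()
start = time.perf_counter()
with torch.no_grad():
    for _ in range(100):
        model(x)
if device.type == "cuda":
    torch.cuda.synchronize()
elapsed = time.perf_counter() - start

print(f"Average inference latency: {elapsed / 100 * 1000:.2f} ms")
```

A comparable measurement for a TensorRT-optimized engine would follow the same warm-up/synchronize/time pattern, with the PyTorch forward pass replaced by the engine's execution call.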