MACHINE LEARNING MODEL INFERENCE SYSTEM WITH NCNN ACCELERATOR IN MOBILE ENVIRONMENT USING WASTE SORTING CASE

Machine learning models typically run inference on devices with abundant computing resources. This becomes a problem when a model must be deployed on a small, low-power device. Mobile phones, as widely used low-power devices, are therefore a common deployment target for machine learning models. This study develops a system to optimize the inference process on mobile. The methods examined cover CPU and GPU execution, frame sampling, the brain floating-point (BF16) data type, and the NCNN accelerator, together with their impact on inference time, frames per second (FPS), and performance on Android devices. To illustrate the system, a case study on waste sorting was conducted using models based on the tiny and nano versions of YOLOX. Beyond efficiency, applying machine learning to waste sorting can spur technological development and raise awareness about proper waste disposal. Testing showed that the higher the frame sample rate, the higher the resulting FPS; however, frame sampling means that some frames are never inferred. The best configuration, CPU execution with a sample rate of 10 and YOLOX-nano, produced 15.30 FPS. In the memory test, YOLOX-tiny used more memory than YOLOX-nano. GPU usage during inference was found to be suboptimal because machine learning operations that were not yet supported fell back to the CPU. In addition, the BF16 data type sped up inference significantly, by about 48%.
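The frame-sampling strategy described in the abstract can be sketched as follows: with a sample rate of N, only every N-th camera frame is passed to the detector, and the frames in between reuse the most recent detection. This is a hypothetical illustration of the idea, not code from the thesis; the names `sample_frames` and `run_detector` are assumptions.

```python
def sample_frames(frames, sample_rate, run_detector):
    """Run run_detector on every `sample_rate`-th frame only.

    Returns a list of (frame, detection) pairs; skipped frames carry
    the most recent detection (None before the first inference).
    """
    last_detection = None
    results = []
    for i, frame in enumerate(frames):
        if i % sample_rate == 0:      # only these frames are inferred
            last_detection = run_detector(frame)
        results.append((frame, last_detection))
    return results

# 25 frames at sample rate 10: the detector runs only on frames 0, 10, 20.
calls = []
detect = lambda f: calls.append(f) or f"det@{f}"
out = sample_frames(list(range(25)), 10, detect)
print(len(calls))  # 3
```

This illustrates the trade-off reported in the abstract: a higher sample rate raises display FPS because fewer frames pay the inference cost, but the skipped frames are never inferred and can only show stale detections.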

Bibliographic Details
Main Author: Syahid Syamsudin, Ilham
Format: Final Project
Language: Indonesia
Online Access:https://digilib.itb.ac.id/gdl/view/65808
Institution: Institut Teknologi Bandung
Language: Indonesia
id id-itb.:65808
spelling id-itb.:65808 2022-06-25T02:41:36Z model inference system, yolox, sample frame, brain floating-point, ncnn, mobile INSTITUT TEKNOLOGI BANDUNG https://digilib.itb.ac.id/gdl/view/65808 text
institution Institut Teknologi Bandung
building Institut Teknologi Bandung Library
continent Asia
country Indonesia
content_provider Institut Teknologi Bandung
collection Digital ITB
language Indonesia
description Machine learning models typically run inference on devices with abundant computing resources. This becomes a problem when a model must be deployed on a small, low-power device. Mobile phones, as widely used low-power devices, are therefore a common deployment target for machine learning models. This study develops a system to optimize the inference process on mobile. The methods examined cover CPU and GPU execution, frame sampling, the brain floating-point (BF16) data type, and the NCNN accelerator, together with their impact on inference time, frames per second (FPS), and performance on Android devices. To illustrate the system, a case study on waste sorting was conducted using models based on the tiny and nano versions of YOLOX. Beyond efficiency, applying machine learning to waste sorting can spur technological development and raise awareness about proper waste disposal. Testing showed that the higher the frame sample rate, the higher the resulting FPS; however, frame sampling means that some frames are never inferred. The best configuration, CPU execution with a sample rate of 10 and YOLOX-nano, produced 15.30 FPS. In the memory test, YOLOX-tiny used more memory than YOLOX-nano. GPU usage during inference was found to be suboptimal because machine learning operations that were not yet supported fell back to the CPU. In addition, the BF16 data type sped up inference significantly, by about 48%.
format Final Project
author Syahid Syamsudin, Ilham
spellingShingle Syahid Syamsudin, Ilham
MACHINE LEARNING MODEL INFERENCE SYSTEM WITH NCNN ACCELERATOR IN MOBILE ENVIRONMENT USING WASTE SORTING CASE
author_facet Syahid Syamsudin, Ilham
author_sort Syahid Syamsudin, Ilham
title MACHINE LEARNING MODEL INFERENCE SYSTEM WITH NCNN ACCELERATOR IN MOBILE ENVIRONMENT USING WASTE SORTING CASE
title_short MACHINE LEARNING MODEL INFERENCE SYSTEM WITH NCNN ACCELERATOR IN MOBILE ENVIRONMENT USING WASTE SORTING CASE
title_full MACHINE LEARNING MODEL INFERENCE SYSTEM WITH NCNN ACCELERATOR IN MOBILE ENVIRONMENT USING WASTE SORTING CASE
title_fullStr MACHINE LEARNING MODEL INFERENCE SYSTEM WITH NCNN ACCELERATOR IN MOBILE ENVIRONMENT USING WASTE SORTING CASE
title_full_unstemmed MACHINE LEARNING MODEL INFERENCE SYSTEM WITH NCNN ACCELERATOR IN MOBILE ENVIRONMENT USING WASTE SORTING CASE
title_sort machine learning model inference system with ncnn accelerator in mobile environment using waste sorting case
url https://digilib.itb.ac.id/gdl/view/65808
_version_ 1822277434070794240