FASTER R-CNN AND SSDLITE MODEL IMPLEMENTATION FOR OBJECT DETECTION APPLICATION FOR LOW VISION USERS

Low vision is a condition in which a person's visual function is permanently reduced, making it difficult to carry out daily activities. One of the problems faced by people with low vision is difficulty finding objects. Mobile applications with object detection features are bui...

Bibliographic Details
Main Author: Azharyanti, Chandrika
Format: Final Project
Language: Indonesia
Online Access: https://digilib.itb.ac.id/gdl/view/69186
Institution: Institut Teknologi Bandung
id id-itb.:69186
spelling id-itb.:69186 2022-09-20T20:58:44Z Object Detection, Faster R-CNN, SSDLite, transfer learning.
institution Institut Teknologi Bandung
building Institut Teknologi Bandung Library
continent Asia
country Indonesia
content_provider Institut Teknologi Bandung
collection Digital ITB
language Indonesia
description Low vision is a condition in which a person's visual function is permanently reduced, making it difficult to carry out daily activities. One of the problems faced by people with low vision is difficulty finding objects. Mobile applications with object detection features are built to help people with disabilities solve this problem. To produce an object detection model that fits this need, two frameworks, Faster R-CNN and SSDLite, were retrained using transfer learning, that is, the reuse of a previously trained model on a new problem. Because the application was built to help people with low vision detect everyday objects, the training dataset was filtered by class: the dataset is MS COCO, and 38 of its 80 available classes were selected for training. Each framework was trained with 3 different optimizers (SGD, Adam, and AdamW), with the best configuration to be implemented in the application. The resulting transfer learning models were assessed using the mAP (mean Average Precision) metric. Based on the inference results on the training server, the 2 configurations with the highest mAP values among the 6 model and optimizer configurations were the Faster R-CNN model with the SGD optimizer, at 61.3%, and the SSDLite model with the AdamW optimizer, at 37.3%. Their inference times were 0.75 and 0.095 seconds, respectively. Inference was then repeated on the application server to confirm that the inference time remains tolerable on a different server specification; both models stayed within the tolerable limit of under 5 seconds. Therefore, the Faster R-CNN model with the SGD optimizer, which has the highest mAP value of 61.3% and an inference time of 3.6 seconds, was chosen as the object detection model used in online mode.
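The selection rule described in the abstract (keep only configurations whose inference time is under the 5-second tolerance, then take the highest mAP) can be sketched as follows. This is a minimal illustration, not the author's code: the names `Candidate` and `pick_best` are hypothetical, and only the two configurations whose mAP and training-server inference times are reported in the abstract are listed, since figures for the other four are not given here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One model/optimizer configuration (hypothetical container for illustration)."""
    model: str
    optimizer: str
    map_percent: float        # mAP value reported in the abstract
    inference_seconds: float  # training-server inference time reported in the abstract


def pick_best(candidates, max_seconds):
    """Return the highest-mAP candidate whose inference time is under max_seconds."""
    within_budget = [c for c in candidates if c.inference_seconds < max_seconds]
    if not within_budget:
        raise ValueError("no candidate meets the inference-time budget")
    return max(within_budget, key=lambda c: c.map_percent)


# Only the two configurations with figures stated in the abstract.
CANDIDATES = [
    Candidate("Faster R-CNN", "SGD", 61.3, 0.75),
    Candidate("SSDLite", "AdamW", 37.3, 0.095),
]

# The abstract states a tolerance of under 5 seconds.
best = pick_best(CANDIDATES, max_seconds=5.0)
print(best.model, best.optimizer, best.map_percent)
```

With the 5-second tolerance both configurations qualify, so the choice falls to mAP alone; a tighter budget (for example, under 0.5 seconds) would instead favor the faster SSDLite configuration.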
format Final Project
author Azharyanti, Chandrika
title FASTER R-CNN AND SSDLITE MODEL IMPLEMENTATION FOR OBJECT DETECTION APPLICATION FOR LOW VISION USERS
url https://digilib.itb.ac.id/gdl/view/69186