IMPLEMENTATION OF YOLOV5 AND MASK R-CNN OBJECT DETECTION MODEL IN REAL TIME MOBILE APPLICATION FOR LOW VISION USERS

Low vision is partial vision loss that cannot be corrected with glasses, contact lenses, or surgery. It differs from blindness, which is total vision loss. People with low vision usually have blurry vision, which makes everyday activities harder. Based on that problem statement, on...

Bibliographic Details
Main Author: Anindya Riyadi, Inka
Format: Final Project
Language: Indonesia
Online Access: https://digilib.itb.ac.id/gdl/view/69191
Institution: Institut Teknologi Bandung
id id-itb.:69191
spelling id-itb.:69191; 2022-09-20T21:06:11Z; object detection, transfer learning, Mask R-CNN, YOLOv5, mobile application
institution Institut Teknologi Bandung
building Institut Teknologi Bandung Library
continent Asia
country Indonesia
content_provider Institut Teknologi Bandung
collection Digital ITB
description Low vision is partial vision loss that cannot be corrected with glasses, contact lenses, or surgery. It differs from blindness, which is total vision loss. People with low vision usually have blurry vision, which makes everyday activities harder. Object detection, one of the more recent applications of deep learning, can help people with low vision. However, several published mobile applications for low vision users cannot detect objects in real time, and none of them supports Bahasa Indonesia, so they can only be used by people who understand English. This research covers the implementation of Mask R-CNN and YOLOv5 using transfer learning. Both models are trained, but only one is implemented in the mobile application. The models are trained with several optimizers (SGD, Adam, and AdamW) to find which one yields the best performance. The results are first compared and analyzed on a server: YOLOv5s with SGD has the highest mAP score (0.39401), and YOLOv5s with AdamW has the fastest inference time. These two model-optimizer combinations are then compared on a mobile device, where YOLOv5s with SGD has a faster inference time (930 ms) than YOLOv5s with AdamW (3037 ms). Therefore, YOLOv5s with SGD is chosen as the best-performing model and implemented in the mobile application.
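
The inference-time comparison described in the abstract could be sketched roughly as follows. This is a minimal illustration, not the author's code: `mean_latency_ms` and `fastest` are hypothetical helper names, the timing loop assumes a zero-argument callable that runs one inference, and the only figures used are the two mobile inference times reported in the abstract.

```python
import time
from statistics import mean
from typing import Callable, Dict


def mean_latency_ms(infer: Callable[[], object], runs: int = 20, warmup: int = 3) -> float:
    """Average wall-clock time of one call to `infer`, in milliseconds.

    `infer` is a hypothetical stand-in for "run the model on one frame".
    """
    for _ in range(warmup):  # warm-up calls are executed but not timed
        infer()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        infer()
        samples.append((time.perf_counter() - start) * 1000.0)
    return mean(samples)


def fastest(latencies_ms: Dict[str, float]) -> str:
    """Name of the candidate with the lowest mean latency."""
    return min(latencies_ms, key=latencies_ms.get)


# Mobile inference times as reported in the abstract (milliseconds).
reported_mobile_ms = {"YOLOv5s-SGD": 930.0, "YOLOv5s-AdamW": 3037.0}
print(fastest(reported_mobile_ms))
```

In practice `mean_latency_ms` would be called once per trained model on the target device, and the resulting dictionary passed to `fastest`; the warm-up runs are there because first calls often include one-off setup cost.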