DEVELOPMENT OF VISUAL SEARCH SYSTEM FOR E-COMMERCE PRODUCTS USING CONVOLUTIONAL NEURAL NETWORK

Bibliographic Details
Main Author: Faishol Huda, Ahmad
Format: Final Project
Language: Indonesian
Online Access: https://digilib.itb.ac.id/gdl/view/51423
Institution: Institut Teknologi Bandung
Description
Summary: The rapid development of machine learning algorithms, especially in computer vision, has accelerated the development of visual search technology. As one of the fastest-developing computer vision technologies of this decade, visual search has become vital for e-commerce companies and other technology-based companies. Visual search takes an image as the query instead of traditional text. It uses the features of an object in the image as the basis of comparison when searching for similar images that contain the same object. A visual search model can be built either by designing an architecture and training it from scratch, or by transfer learning, which reuses pretrained models. This work uses four models with proven performance, VGG, GoogLeNet (Inception), ResNet, and MobileNet, as bases for transfer learning, plus one model that we built from scratch. The evaluation metrics in this experiment are category prediction accuracy, precision, and recall. For the five models (the scratch-built model, VGG, Inception, ResNet, and MobileNet, in that order), the results are 0.7160, 0.7084, 0.8254, 0.8640, and 0.8111 for category prediction accuracy; 0.7394, 0.7299, 0.8603, 0.8912, and 0.8171 for precision; and 0.3147, 0.2989, 0.4721, 0.5779, and 0.4278 for recall. From these results, we conclude that the best architecture uses ResNet as the base with transfer learning, which gives 0.8640 for category prediction accuracy, 0.8912 for precision, and 0.5779 for recall. Modules that can further improve visual search performance are a dropout module, a data augmentation module, and a reindexing module.
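
The retrieval and evaluation steps described in the summary can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation: the three-dimensional vectors are hand-made stand-ins for CNN feature vectors, cosine similarity is an assumed similarity measure (the summary does not name one), and a retrieved item is assumed to be relevant when it shares the query's category. All names (`INDEX`, `search`, `precision_recall`) are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query_vec, index, k):
    """Return the ids of the k indexed items most similar to the query."""
    ranked = sorted(
        index,
        key=lambda item: cosine_similarity(query_vec, item["vec"]),
        reverse=True,
    )
    return [item["id"] for item in ranked[:k]]


def precision_recall(retrieved_ids, relevant_ids):
    """Precision and recall of a retrieved list against the relevant set."""
    hits = len(set(retrieved_ids) & set(relevant_ids))
    precision = hits / len(retrieved_ids) if retrieved_ids else 0.0
    recall = hits / len(relevant_ids) if relevant_ids else 0.0
    return precision, recall


# Toy catalog: in a real system each "vec" would come from a CNN,
# for example the penultimate layer of a pretrained network (assumption).
INDEX = [
    {"id": "shoe-1", "category": "shoe", "vec": [0.9, 0.1, 0.0]},
    {"id": "shoe-2", "category": "shoe", "vec": [0.8, 0.2, 0.1]},
    {"id": "shoe-3", "category": "shoe", "vec": [0.1, 0.2, 0.9]},
    {"id": "bag-1", "category": "bag", "vec": [0.0, 1.0, 0.1]},
    {"id": "bag-2", "category": "bag", "vec": [0.2, 0.9, 0.0]},
]

query = [1.0, 0.1, 0.0]  # stand-in for the features of a query image of a shoe
top = search(query, INDEX, k=2)

# Assumed relevance rule: same category as the query.
relevant = [item["id"] for item in INDEX if item["category"] == "shoe"]
precision, recall = precision_recall(top, relevant)

print(top, precision, round(recall, 4))
```

In this toy case both retrieved items are shoes, so precision is 1.0, while recall is 2/3 because one shoe lies far from the query in feature space and is not returned within the top two.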