DEVELOPMENT OF VISUAL SEARCH SYSTEM FOR E-COMMERCE PRODUCTS USING CONVOLUTIONAL NEURAL NETWORK
Main Author: | |
---|---|
Format: | Final Project |
Language: | Indonesian |
Online Access: | https://digilib.itb.ac.id/gdl/view/51423 |
Institution: | Institut Teknologi Bandung |
Summary: | In the digital era, the rapid development of machine learning algorithms, especially in
computer vision, has accelerated the development of visual search technology. As one of the
fastest-developing computer vision technologies of this decade, visual search has become vital
for e-commerce companies and other technology-based companies. Visual search uses an
image as the query instead of traditional text. It uses the features of an object in an image as
the point of comparison when searching for similar images that contain the same object. A
visual search model can be built either by designing an architecture and training it from
scratch, or by transfer learning, which makes use of pretrained models. This work uses four
architectures with proven performance, VGG, GoogLeNet (Inception), ResNet, and MobileNet,
as bases for transfer learning, plus one model built from scratch. The evaluation metrics used
in the experiment are category prediction accuracy, precision, and recall. For the five models,
in the order self-made model, VGG, Inception, ResNet, and MobileNet, the results are 0.7160,
0.7084, 0.8254, 0.8640, and 0.8111 for category prediction accuracy; 0.7394, 0.7299, 0.8603,
0.8912, and 0.8171 for precision; and 0.3147, 0.2989, 0.4721, 0.5779, and 0.4278 for recall.
From these results, we conclude that the best architecture uses ResNet as the base with
transfer learning, which gives 0.8640 for category prediction accuracy, 0.8912 for precision,
and 0.5779 for recall. Modules that can further improve visual search performance are a
dropout module, a data augmentation module, and a reindexing module. |
---|
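
The summary describes visual search as comparing object features between a query image and indexed images, evaluated with precision and recall. The following is a minimal sketch of that idea, not the author's implementation. It assumes that features are compared with cosine similarity and that a retrieved item counts as relevant when it shares the query's category; neither assumption is stated in the record. The names `search`, `precision_recall`, and `INDEX`, and the 3-dimensional vectors standing in for CNN features, are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query_vec, index, k):
    """Return the ids of the k indexed items most similar to the query."""
    ranked = sorted(
        index,
        key=lambda item: cosine_similarity(query_vec, item["features"]),
        reverse=True,
    )
    return [item["id"] for item in ranked[:k]]


def precision_recall(retrieved_ids, relevant_ids):
    """Precision and recall of a retrieved list against the relevant set."""
    relevant = set(relevant_ids)
    hits = sum(1 for item_id in retrieved_ids if item_id in relevant)
    precision = hits / len(retrieved_ids) if retrieved_ids else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Toy index: hypothetical hand-written vectors in place of CNN features.
INDEX = [
    {"id": "shoe_1", "category": "shoe", "features": [0.9, 0.1, 0.0]},
    {"id": "shoe_2", "category": "shoe", "features": [0.8, 0.2, 0.1]},
    {"id": "shoe_3", "category": "shoe", "features": [0.1, 0.2, 0.9]},
    {"id": "bag_1", "category": "bag", "features": [0.1, 0.9, 0.1]},
    {"id": "bag_2", "category": "bag", "features": [0.7, 0.3, 0.0]},
]

# Assumption: the query image shows a shoe, so every shoe in the index is relevant.
query = [1.0, 0.0, 0.0]
results = search(query, INDEX, k=2)
relevant = [item["id"] for item in INDEX if item["category"] == "shoe"]
precision, recall = precision_recall(results, relevant)
print(results, precision, recall)
```

In this toy index both retrieved items are shoes, so precision is 1.0, while recall is 2/3 because one relevant shoe is not retrieved. In a real system the feature vectors would come from a trained or pretrained network rather than being written by hand.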