Feature representation and learning methods in visual search applications

This thesis studies and develops image feature representation and learning methods for visual search applications. We study both handcrafted features and deep-learning-based representations. Handcrafted-feature-based methods are lightweight and do not require large training data. Ho...

Bibliographic Details
Main Author: Manandhar, Dipu
Other Authors: Yap Kim Hui
Format: Theses and Dissertations
Language: English
Published: 2019
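Subjects: DRNTU::Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision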
Online Access: https://hdl.handle.net/10356/81689
http://hdl.handle.net/10220/47995
Institution: Nanyang Technological University
id sg-ntu-dr.10356-81689
record_format dspace
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision
description This thesis studies and develops image feature representation and learning methods for visual search applications. We study both handcrafted features and deep-learning-based representations. Handcrafted-feature-based methods are lightweight and do not require large training data. However, they have been outperformed by deep learning methods in various vision-related problems in recent years. Deep learning methods, in turn, generally require large amounts of data and computational power. In view of this, this thesis studies both handcrafted and deep learning methods for two selected domains, namely visual landmark search, and visual fashion search and its applications. The first application develops algorithms for visual landmark search. Repetitive patterns in landmark images cause a visual burstiness issue, which adversely affects the image representation. To tackle this, we propose a novel Lattice-Support Repetitive Local Feature Detection (LS-RLF) technique, which first detects repetitive patterns present in images and then uses the detection information during image representation. As repetitive pattern detection is an early-vision problem that requires local feature analysis, we develop our algorithms using a handcrafted local feature-based representation. We also present a new Feature Repetitiveness Similarity (FRS) metric, which quantizes the repetitive and unique features independently and matches them separately. The FRS metric makes use of the information in repetitive patterns to enhance the search while avoiding the visual burstiness issue. Experiments conducted on three benchmark datasets, namely the Oxford, Paris, and Inria Holidays datasets, show the effectiveness of the proposed methods.
The second application studies feature representation methods for fashion images using deep learning. We collect a new fashion dataset, the NTUBrandFashion (NBF) dataset, with 10K fashion images that are richly annotated with essential elements of fashion: categories, attributes, and brand. We propose a new brand-aware fashion search (BAFS) method, which takes user brand preference into account. This search method uses a deep feature encoding that leverages hierarchies of CNN activations to extract a rich visual representation from clothing images. The brand-aware re-ranking in the BAFS framework further improves the search performance. We also propose a new Attribute-Supervised Metric Learning (ASML) method to learn discriminative embeddings from clothing images. This deep-metric-learning-based method incorporates image attribute information to supervise the triplet network training. This serves two purposes: (i) mining informative triplets and (ii) treating the triplets in a soft manner based on their importance, which helps capture similarity at different levels. Experiments conducted on the NBF and DeepFashion datasets show the effectiveness of the proposed method.
The third work studies methods for fashion trend analysis and popularity prediction based on visual analysis of clothing images. We develop visual representation methods for fashion images that capture their trend information to predict their popularity in terms of click rates. We propose an image-based model and a sequence-based model using deep networks to predict the click rate of fashion items. The image-based model uses a CNN to predict clothing popularity. The sequence-based model takes a time sequence of clothing images, using CNNs to extract visual features and an RNN to model the trend information. To the best of our knowledge, this is the first work to explore visual information for fashion forecasting of individual items. Experiments conducted on a dataset obtained from an online fashion company show promising results for fashion forecasting, outperforming the recent comparative method.
author2 Yap Kim Hui
author_facet Yap Kim Hui
Manandhar, Dipu
format Theses and Dissertations
author Manandhar, Dipu
author_sort Manandhar, Dipu
title Feature representation and learning methods in visual search applications
publishDate 2019
url https://hdl.handle.net/10356/81689
http://hdl.handle.net/10220/47995
_version_ 1772827986337202176
spelling sg-ntu-dr.10356-81689 2023-07-04T16:31:45Z
Feature representation and learning methods in visual search applications
Manandhar, Dipu
Yap Kim Hui
School of Electrical and Electronic Engineering
Rapid-Rich Object Search Lab
DRNTU::Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision
Doctor of Philosophy
2019-04-08T05:27:36Z 2019-12-06T14:36:09Z 2019
Thesis
Manandhar, D. (2019). Feature representation and learning methods in visual search applications. Doctoral thesis, Nanyang Technological University, Singapore.
https://hdl.handle.net/10356/81689
http://hdl.handle.net/10220/47995
10.32657/10220/47995
en
146 p.
application/pdf