On handbag recognition and recommendation

From Google to Pinterest, multimedia search engines such as Google Goggles deliver a wealth of visual information related to the search query. Goggles recognizes and returns useful information when the mobile phone camera is pointed at a business card, a book, a painting, a famous landmark, or a barcode....

Bibliographic Details
Main Author:	Wang, Yan
Other Authors:	Kot Chichung, Alex
Format:	Theses and Dissertations
Language:	English
Published:	2016
Subjects:	DRNTU::Engineering
Online Access:	https://hdl.handle.net/10356/69405
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-69405
record_format	dspace
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
description	From Google to Pinterest, multimedia search engines such as Google Goggles deliver a wealth of visual information related to the search query. Goggles recognizes and returns useful information when the mobile phone camera is pointed at a business card, a book, a painting, a famous landmark, or a barcode. Vision-based techniques try to perceive and understand images by emulating human vision, and developing such techniques remains an ongoing challenge for computers. Multimedia systems for online advertising and commerce are in high market demand, and in recent years the computer vision and multimedia communities have devoted effort to applications such as fashion retrieval and recommendation for clothing and shoes. The handbag has become a desirable fashion accessory, with six in ten consumers having purchased at least one new handbag in 2014. This demand motivates vision products for handbag recognition, yet such products remain limited. As Google notes, Goggles does not yet work well on things like food, plants, animals, and some fashion items such as handbags. To develop reliable recognition engines of this kind, we study handbag recognition and recommendation, which are key steps in building a multimedia search system. The work in this thesis is summarized below. First, we propose a style-to-color discriminative representation framework for handbag recognition. Motivated by the visual characteristics of handbags, we identify the handbag model by performing style-based recognition followed by color-based recognition. Experiments on our newly constructed handbag datasets show that the method improves recognition accuracy by over 10% compared with existing fine-grained or generic object recognition methods.
Convolutional Neural Networks (CNNs) have recently shown promise on many image recognition tasks, which motivates us to design a CNN-based handbag recognition algorithm. However, after studying various CNN architectures for training the classifier, we find that previous CNN models do not provide discriminative color information during training. Moreover, CNN models usually use only the hard label (i.e., the ground-truth class label) to train a multiclass classifier, which is insufficient, especially for visually similar classes. To train a better CNN for classification, we present a Feature Selective joint Classification-Regression CNN (FSCR-CNN) model. It helps recognize color-sensitive objects and facilitates classifier modeling for visually similar classes. We further propose an end-to-end handbag recognition framework with three components: (1) symmetry-based proposal localization, (2) CNN detection and FSCR-CNN classification, and (3) combination of detection scores and classification scores by a conditional probability model. The experimental results verify the advantage of each component of the framework for handbag recognition. Finally, we propose a handbag recommendation system for e-commerce and shops. It helps shoppers find desirable fashion items, which facilitates online interaction and product promotion. Given images of the shopper's preferred handbags, recommendation is performed by jointly learning an attribute projection and a one-class SVM classifier. A weighted AutoEncoder method is further proposed to refine the recommended results. Initial subject testing shows that this scheme performs favorably.
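The abstract does not spell out the conditional probability model used to combine detection and classification scores. As an illustration only, the sketch below assumes the simplest such model: the joint score of a handbag model for a region proposal is P(handbag | region) multiplied by P(model | handbag, region). The function names, labels, and numbers are hypothetical and are not taken from the thesis.

```python
from typing import Dict, List, Optional, Tuple


def combine_scores(detection_prob: float,
                   class_probs: Dict[str, float]) -> Dict[str, float]:
    """Joint score per label, assuming
    P(model, handbag | region) = P(handbag | region) * P(model | handbag, region).
    This factorization is an assumption, not the thesis's stated model."""
    if not 0.0 <= detection_prob <= 1.0:
        raise ValueError("detection_prob must lie in [0, 1]")
    return {label: detection_prob * p for label, p in class_probs.items()}


def best_prediction(
    proposals: List[Tuple[float, Dict[str, float]]]
) -> Tuple[Optional[str], float]:
    """Return the (label, joint score) pair with the highest joint score
    over all region proposals; (None, -1.0) if there are no proposals."""
    best_label: Optional[str] = None
    best_score = -1.0
    for detection_prob, class_probs in proposals:
        for label, score in combine_scores(detection_prob, class_probs).items():
            if score > best_score:
                best_label, best_score = label, score
    return best_label, best_score


# Hypothetical proposals: (detector confidence, classifier posterior).
proposals = [
    (0.9, {"tote_red": 0.6, "tote_blue": 0.4}),
    (0.3, {"tote_red": 0.1, "tote_blue": 0.9}),
]
label, score = best_prediction(proposals)
print(label, round(score, 2))
```

In this toy input the second proposal has a very confident classifier output, but its low detection score keeps it from outranking the first proposal, which is the behavior a combined score is meant to provide.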
spelling	sg-ntu-dr.10356-69405 2023-07-04T16:25:18Z On handbag recognition and recommendation Wang, Yan Kot Chichung, Alex School of Electrical and Electronic Engineering DRNTU::Engineering ELECTRICAL and ELECTRONIC ENGINEERING 2016-12-23T08:32:09Z 2016-12-23T08:32:09Z 2016 Thesis Wang, Y. (2016). On handbag recognition and recommendation. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/69405 10.32657/10356/69405 en 161 p. application/pdf
