Visual recognition by learning from web data

With the rapid development of digital cameras, we have witnessed an explosive growth of digital images. Every day, a tremendous number of images, together with rich contextual information (e.g., tags, categories and captions), are posted to the Internet. There is increasing interest in exploiting those...

Full description

Bibliographic Details
Main Author: Li, Wen
Other Authors: Xu Dong
Format: Theses and Dissertations
Language: English
Published: 2015
Subjects: DRNTU::Engineering::Computer science and engineering::Computing methodologies::Pattern recognition
Online Access: https://hdl.handle.net/10356/62186
Institution: Nanyang Technological University
Thesis Details
Degree: Doctor of Philosophy (SCE)
School: School of Computer Engineering
Research Centre: Centre for Multimedia and Network Technology
Citation: Li, W. (2014). Visual recognition by learning from web data. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/62186
DOI: 10.32657/10356/62186
Extent: 152 p. (application/pdf)
Deposited: 2015-02-25
Collection: DR-NTU (NTU Library, Singapore)
Description: With the rapid development of digital cameras, we have witnessed an explosive growth of digital images. Every day, a tremendous number of images, together with rich contextual information (e.g., tags, categories and captions), are posted to the Internet. There is increasing interest in exploiting these web images to build intelligent visual recognition systems. While some works have collected large-scale image datasets by crawling web images from the Internet, considerable human effort is still required to annotate those images in order to train classifiers for visual recognition. In this thesis, we develop novel learning algorithms for visual recognition by learning from web data, aiming to use as little human effort as possible for annotating the training data.

First, considering that web images are usually associated with noisy surrounding textual descriptions, we treat the words in the surrounding text as weak labels and formulate the task of learning from web data as a multi-instance learning (MIL) problem. Observing that the relevant images usually contain many true positives, we generalize the traditional MIL constraint on positive bags so that each positive bag must contain at least a portion of positive instances. To effectively exploit such constraints on positive bags, we develop a new MIL algorithm, called MIL with constrained positive bags (MIL-CPB), for web image retrieval. Because the constraints are not always satisfied in MIL-CPB, we further propose a progressive scheme to improve retrieval performance: we iteratively partition the top-ranked training web images from the current MIL-CPB classifier to construct more confident positive bags, and then use these new bags as training data to learn subsequent MIL-CPB classifiers.
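The generalized positive-bag constraint at the heart of MIL-CPB can be illustrated with a small sketch (illustrative Python only, not the thesis implementation; the function names and the portion value `sigma` are assumptions):

```python
def satisfies_traditional_mil(bag_labels):
    """Traditional MIL: a positive bag must contain at least one positive instance."""
    return any(y == 1 for y in bag_labels)

def satisfies_cpb(bag_labels, sigma=0.6):
    """MIL-CPB generalization: a positive bag must contain at least a portion
    sigma of positive instances, reflecting that the images retrieved for a
    relevant query usually include many true positives."""
    n_pos = sum(1 for y in bag_labels if y == 1)
    return n_pos >= sigma * len(bag_labels)

# A bag of 5 crawled web images for one query; 1 = truly relevant, 0 = noise.
bag = [1, 0, 1, 1, 0]
print(satisfies_traditional_mil(bag))       # True: one positive is enough
print(satisfies_cpb(bag, sigma=0.6))        # True: 3/5 >= 0.6
print(satisfies_cpb([1, 0, 0, 0, 0], 0.6))  # False under the stronger constraint
```

Under the traditional constraint, a bag with a single relevant image already counts as positive; the portion constraint is the stronger assumption that web image collections make plausible.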
Second, when the web training data are represented with multiple views of features, we propose a co-labeling approach that improves the classifiers learnt from web data by exploiting the multiple views. We model the learning problem on each view as a weakly labeled learning problem, and use the training labels predicted by the classifier trained on one view to help train the classifier on another view. Our co-labeling approach can not only handle the traditional multi-view semi-supervised learning problem, but can also be applied to other multi-view weakly labeled learning problems such as multi-view MIL.

Finally, we observe that there are intrinsic differences between the crawled web training data and the test images encountered in daily life, a discrepancy known as the domain adaptation problem. In particular, we study the heterogeneous domain adaptation problem, in which the samples in the source and target domains have different feature representations. We build upon the recent Heterogeneous Feature Augmentation (HFA) method and propose a convex reformulation of HFA that guarantees a globally optimal solution. We further extend HFA to semi-supervised HFA (SHFA), in which we improve the learnt classifiers by exploiting additional unlabeled data from the target domain. For all the proposed approaches, we conduct extensive experiments on publicly available datasets to demonstrate their effectiveness.
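The co-labeling idea of letting each view's classifier supply training labels for the other view can be sketched on a toy two-view problem (the data, the nearest-centroid learner, and the update rule are all simplified assumptions, not the thesis formulation):

```python
# Toy co-labeling sketch: the classifier trained on one view predicts labels
# for unlabeled samples, and those predictions augment the training set of the
# classifier on the other view, and vice versa.

def centroid_classifier(train):
    """Nearest-centroid classifier on a single 1-D view (list of (x, y) pairs)."""
    pos = [x for x, y in train if y == 1]
    neg = [x for x, y in train if y == 0]
    c1, c0 = sum(pos) / len(pos), sum(neg) / len(neg)
    return lambda x: 1 if abs(x - c1) < abs(x - c0) else 0

# Each sample: (view1_feature, view2_feature); labels known only for a few.
labeled = [((1.0, 10.0), 1), ((1.2, 11.0), 1), ((5.0, 2.0), 0), ((5.2, 1.5), 0)]
unlabeled = [(1.1, 10.5), (5.1, 1.8)]

train_v1 = [(v1, y) for (v1, _), y in labeled]
train_v2 = [(v2, y) for (_, v2), y in labeled]

for _ in range(2):  # a couple of co-labeling rounds
    f1 = centroid_classifier(train_v1)
    f2 = centroid_classifier(train_v2)
    # Labels predicted on view 1 augment the view-2 training set, and vice versa.
    train_v2 += [(v2, f1(v1)) for v1, v2 in unlabeled]
    train_v1 += [(v1, f2(v2)) for v1, v2 in unlabeled]

print([f1(v1) for v1, _ in unlabeled])  # → [1, 0]
```

Each round, the unlabeled samples receive pseudo-labels from one view and strengthen the classifier on the other, which is the mechanism by which the two views help each other.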