Visual recognition by learning from web data

With the rapid development of digital cameras, we have witnessed an explosive growth of digital images. Every day, a tremendous number of images, together with rich contextual information (e.g., tags, categories and captions), are posted to the Internet. There is increasing interest in exploiting those...

Full description

Bibliographic Details
Main Author: Li, Wen
Other Authors: Xu Dong
Format: Theses and Dissertations
Language: English
Published: 2015
Subjects: DRNTU::Engineering::Computer science and engineering::Computing methodologies::Pattern recognition
Online Access: https://hdl.handle.net/10356/62186
Institution: Nanyang Technological University
Thesis Details
Degree: Doctor of Philosophy (SCE)
School: School of Computer Engineering
Research Centre: Centre for Multimedia and Network Technology
Citation: Li, W. (2014). Visual recognition by learning from web data. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/62186
DOI: 10.32657/10356/62186
Extent: 152 p. (application/pdf)
Deposited: 2015-02-25
Collection: DR-NTU (NTU Library, Singapore)
Description: With the rapid development of digital cameras, we have witnessed an explosive growth of digital images. Every day, a tremendous number of images, together with rich contextual information (e.g., tags, categories and captions), are posted to the Internet. There is increasing interest in exploiting these web images to build intelligent visual recognition systems. While some works have collected large-scale image datasets by crawling web images from the Internet, considerable human effort is still required to annotate those images in order to train classifiers for visual recognition. In this thesis, we develop novel learning algorithms for visual recognition by learning from web data, aiming to use as little human effort as possible for annotating the training data.

First, considering that web images are usually associated with noisy surrounding textual descriptions, we treat the words in the surrounding text as weak labels and formulate the task of learning from web data as a multi-instance learning (MIL) problem. Observing that the relevant images usually contain many true positives, we generalize the traditional MIL constraint on positive bags so that each positive bag must contain at least a portion of positive instances. To effectively exploit such constraints on positive bags, we develop a new MIL algorithm, called MIL with constrained positive bags (MIL-CPB), for web image retrieval. Because the constraints are not always satisfied in MIL-CPB, we further propose a progressive scheme to improve retrieval performance: we iteratively partition the top-ranked training web images from the current MIL-CPB classifier to construct more confident positive bags, and then use these new bags as training data to learn subsequent MIL-CPB classifiers.
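The generalized positive-bag constraint at the heart of MIL-CPB can be illustrated with a small sketch (illustrative Python only, not the thesis implementation; the function names and the portion value `sigma` are assumptions):

```python
def satisfies_traditional_mil(bag_labels):
    """Traditional MIL: a positive bag must contain at least one positive instance."""
    return any(y == 1 for y in bag_labels)

def satisfies_cpb(bag_labels, sigma=0.6):
    """MIL-CPB generalization: a positive bag must contain at least a portion
    sigma of positive instances, reflecting that the images retrieved for a
    relevant query usually include many true positives."""
    n_pos = sum(1 for y in bag_labels if y == 1)
    return n_pos >= sigma * len(bag_labels)

# A bag of 5 crawled web images for one query; 1 = truly relevant, 0 = noise.
bag = [1, 0, 1, 1, 0]
print(satisfies_traditional_mil(bag))       # True: one positive is enough
print(satisfies_cpb(bag, sigma=0.6))        # True: 3/5 >= 0.6
print(satisfies_cpb([1, 0, 0, 0, 0], 0.6))  # False under the stronger constraint
```

Under the traditional constraint, a bag with a single relevant image already counts as positive; the portion constraint is the stronger assumption that web image collections make plausible.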
Second, when the web training data are represented with multiple views of features, we propose a co-labeling approach that improves the classifiers learnt from web data by exploiting the multiple views. We model the learning problem on each view as a weakly labeled learning problem, and use the training labels predicted by the classifier trained on one view to help train the classifier on another view. Our co-labeling approach can not only handle the traditional multi-view semi-supervised learning problem, but can also be applied to other multi-view weakly labeled learning problems such as multi-view MIL.

Finally, we observe that there are intrinsic differences between the crawled web training data and the test images encountered in daily life, a discrepancy known as the domain adaptation problem. In particular, we study the heterogeneous domain adaptation problem, in which the samples in the source and target domains have different feature representations. We build upon the recent Heterogeneous Feature Augmentation (HFA) method and propose a convex reformulation of HFA that guarantees a globally optimal solution. We further extend HFA to semi-supervised HFA (SHFA), in which we improve the learnt classifiers by exploiting additional unlabeled data from the target domain. For all the proposed approaches, we conduct extensive experiments on publicly available datasets to demonstrate their effectiveness.
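The co-labeling idea of letting each view's classifier supply training labels for the other view can be sketched on a toy two-view problem (the data, the nearest-centroid learner, and the update rule are all simplified assumptions, not the thesis formulation):

```python
# Toy co-labeling sketch: the classifier trained on one view predicts labels
# for unlabeled samples, and those predictions augment the training set of the
# classifier on the other view, and vice versa.

def centroid_classifier(train):
    """Nearest-centroid classifier on a single 1-D view (list of (x, y) pairs)."""
    pos = [x for x, y in train if y == 1]
    neg = [x for x, y in train if y == 0]
    c1, c0 = sum(pos) / len(pos), sum(neg) / len(neg)
    return lambda x: 1 if abs(x - c1) < abs(x - c0) else 0

# Each sample: (view1_feature, view2_feature); labels known only for a few.
labeled = [((1.0, 10.0), 1), ((1.2, 11.0), 1), ((5.0, 2.0), 0), ((5.2, 1.5), 0)]
unlabeled = [(1.1, 10.5), (5.1, 1.8)]

train_v1 = [(v1, y) for (v1, _), y in labeled]
train_v2 = [(v2, y) for (_, v2), y in labeled]

for _ in range(2):  # a couple of co-labeling rounds
    f1 = centroid_classifier(train_v1)
    f2 = centroid_classifier(train_v2)
    # Labels predicted on view 1 augment the view-2 training set, and vice versa.
    train_v2 += [(v2, f1(v1)) for v1, v2 in unlabeled]
    train_v1 += [(v1, f2(v2)) for v1, v2 in unlabeled]

print([f1(v1) for v1, _ in unlabeled])  # → [1, 0]
```

Each round, the unlabeled samples receive pseudo-labels from one view and strengthen the classifier on the other, which is the mechanism by which the two views help each other.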