Learning with multiple representations : algorithms and applications

Recently, lots of visual representations have been developed for computer vision applications. As different types of visual representations may reflect different kinds of information about the original data, their differentiation ability may vary greatly. As the existing machine learning algorithms...

Full description

Saved in:
Bibliographic Details
Main Author: Xu, Xinxing
Other Authors: Xu Dong
Format: Theses and Dissertations
Language:English
Published: 2015
Subjects:
Online Access:https://hdl.handle.net/10356/62188
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Nanyang Technological University
Language: English
id sg-ntu-dr.10356-62188
record_format dspace
spelling sg-ntu-dr.10356-621882023-03-04T00:46:21Z Learning with multiple representations : algorithms and applications Xu, Xinxing Xu Dong School of Computer Engineering Centre for Multimedia and Network Technology DRNTU::Engineering::Computer science and engineering::Computing methodologies::Pattern recognition Recently, lots of visual representations have been developed for computer vision applications. As different types of visual representations may reflect different kinds of information about the original data, their differentiation ability may vary greatly. As the existing machine learning algorithms are mostly based on the single data representation, it becomes more and more important to develop machine learning algorithms for tackling data with multiple representations. Therefore, in this thesis we study the problem of learning with multiple representations. We develop several novel algorithms to tackle data with multiple representations under three different learning scenarios, and we apply the proposed algorithms to a few computer vision applications. Specifically, we first study the learning with multiple kernels under fully supervised setting. Based on a hard margin perspective for the dual form of the traditional ℓ1-norm Multiple Kernel Learning (MKL), we introduce a new “kernel slack variable” and propose a Soft Margin framework for Multiple Kernel Learning (SMMKL). By incorporating the hinge loss for kernel slack variables, a new box constraint for the kernel coefficients is introduced for Multiple Kernel Learning. The square hinge loss and the square loss soft margin MKLs naturally incorporate the family of elastic-net MKL and ℓ2MKL, respectively. We demonstrate the effectiveness of our proposed algorithms on benchmark data sets as well as several computer vision data sets. Second, we study the learning with multiple kernels for weakly labeled data. Based on “input-output kernels”, we propose a unified Input-output Kernel Learning (IOKL) framework for handling weakly labeled data with multiple representations. Under this framework, the general data ambiguity problems such as SSL, MIL and clustering with multiple representations are solved in a unified framework. We formulate the learning problem as a group sparse MKL problem to incorporate the intrinsic group structure among the input-output kernels. A group sparse soft margin regularization is further developed to improve the performance. The promising experimental results on the challenging NUS-WIDE dataset for a computer vision application (i.e., text-based image retrieval), SSL benchmark datasets and MIL benchmark datasets demonstrate the effectiveness of our proposed IOKL framework. Third, we study the learning with privileged information for distance metric learning, where the distance metric is learnt with extra privileged information which is available only in the training data but unavailable in the test data. We propose a novel method called Information-theoretic Metric Learning with Privileged Information (ITML+) to model the learning scenario. An efficient cyclical projection method based on analytical solutions for all the variables is also developed to solve the new objective function. The proposed algorithm is applied to face verification and person re-identification in RGB images by learning from the RGB-D data. The extensive experiments are conducted on the real-world EUROCOM, CurtinFaces and BIWI RGBD-ID datasets and the results demonstrate the effectiveness of our newly proposed ITML+ algorithm. DOCTOR OF PHILOSOPHY (SCE) 2015-02-25T03:24:48Z 2015-02-25T03:24:48Z 2014 2014 Thesis Xu, X. (2014). Learning with multiple representations : algorithms and applications. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/62188 10.32657/10356/62188 en 170 p. application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering::Computer science and engineering::Computing methodologies::Pattern recognition
spellingShingle DRNTU::Engineering::Computer science and engineering::Computing methodologies::Pattern recognition
Xu, Xinxing
Learning with multiple representations : algorithms and applications
description Recently, lots of visual representations have been developed for computer vision applications. As different types of visual representations may reflect different kinds of information about the original data, their differentiation ability may vary greatly. As the existing machine learning algorithms are mostly based on the single data representation, it becomes more and more important to develop machine learning algorithms for tackling data with multiple representations. Therefore, in this thesis we study the problem of learning with multiple representations. We develop several novel algorithms to tackle data with multiple representations under three different learning scenarios, and we apply the proposed algorithms to a few computer vision applications. Specifically, we first study the learning with multiple kernels under fully supervised setting. Based on a hard margin perspective for the dual form of the traditional ℓ1-norm Multiple Kernel Learning (MKL), we introduce a new “kernel slack variable” and propose a Soft Margin framework for Multiple Kernel Learning (SMMKL). By incorporating the hinge loss for kernel slack variables, a new box constraint for the kernel coefficients is introduced for Multiple Kernel Learning. The square hinge loss and the square loss soft margin MKLs naturally incorporate the family of elastic-net MKL and ℓ2MKL, respectively. We demonstrate the effectiveness of our proposed algorithms on benchmark data sets as well as several computer vision data sets. Second, we study the learning with multiple kernels for weakly labeled data. Based on “input-output kernels”, we propose a unified Input-output Kernel Learning (IOKL) framework for handling weakly labeled data with multiple representations. Under this framework, the general data ambiguity problems such as SSL, MIL and clustering with multiple representations are solved in a unified framework. We formulate the learning problem as a group sparse MKL problem to incorporate the intrinsic group structure among the input-output kernels. A group sparse soft margin regularization is further developed to improve the performance. The promising experimental results on the challenging NUS-WIDE dataset for a computer vision application (i.e., text-based image retrieval), SSL benchmark datasets and MIL benchmark datasets demonstrate the effectiveness of our proposed IOKL framework. Third, we study the learning with privileged information for distance metric learning, where the distance metric is learnt with extra privileged information which is available only in the training data but unavailable in the test data. We propose a novel method called Information-theoretic Metric Learning with Privileged Information (ITML+) to model the learning scenario. An efficient cyclical projection method based on analytical solutions for all the variables is also developed to solve the new objective function. The proposed algorithm is applied to face verification and person re-identification in RGB images by learning from the RGB-D data. The extensive experiments are conducted on the real-world EUROCOM, CurtinFaces and BIWI RGBD-ID datasets and the results demonstrate the effectiveness of our newly proposed ITML+ algorithm.
author2 Xu Dong
author_facet Xu Dong
Xu, Xinxing
format Theses and Dissertations
author Xu, Xinxing
author_sort Xu, Xinxing
title Learning with multiple representations : algorithms and applications
title_short Learning with multiple representations : algorithms and applications
title_full Learning with multiple representations : algorithms and applications
title_fullStr Learning with multiple representations : algorithms and applications
title_full_unstemmed Learning with multiple representations : algorithms and applications
title_sort learning with multiple representations : algorithms and applications
publishDate 2015
url https://hdl.handle.net/10356/62188
_version_ 1759856852253802496