Efficient learning methods for high dimensional visual data

High dimensional visual data, derived from images or videos, is ubiquitous as advanced camera technologies enable more measurements per sample to be captured. Increasingly sophisticated visual data representations further increase data dimensionality. To facilitate high level visu...

Bibliographic Details
Main Author: Chen, Marcus Caixing
Other Authors: Cham Tat Jen
Format: Theses and Dissertations
Language: English
Published: 2015
Online Access: https://hdl.handle.net/10356/65624
Institution: Nanyang Technological University
id sg-ntu-dr.10356-65624
record_format dspace
spelling sg-ntu-dr.10356-65624 2023-03-04T00:38:22Z Efficient learning methods for high dimensional visual data Chen, Marcus Caixing Cham Tat Jen School of Computer Engineering Centre for Multimedia and Network Technology DRNTU::Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence DRNTU::Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision DRNTU::Engineering::Computer science and engineering::Computing methodologies::Pattern recognition DOCTOR OF PHILOSOPHY (SCE) 2015-11-25T06:23:52Z 2015-11-25T06:23:52Z 2015 2015 Thesis Chen, M. C. (2015). Efficient learning methods for high dimensional visual data. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/65624 10.32657/10356/65624 en 172 p. application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
DRNTU::Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision
DRNTU::Engineering::Computer science and engineering::Computing methodologies::Pattern recognition
description High dimensional visual data, derived from images or videos, is ubiquitous as advanced camera technologies enable more measurements per sample to be captured. Increasingly sophisticated visual data representations further increase data dimensionality. To facilitate high level visual analytic tasks, this thesis focuses on three areas of high dimensional data processing: direct graph embedding for sample class or cluster prediction, video tracking for temporal information extraction, and spatial segmentation for a compact representation of high resolution images. To address the challenges of irrelevant, noisy, and highly correlated dimensions, a novel unified framework is proposed that performs graph embedding and feature selection simultaneously. This framework enables efficient extraction of the intrinsic linear structures of the data, which are low dimensional and robust to both noisy dimensions and outlier samples. The framework is computationally efficient and flexible enough to incorporate various data priors such as smoothness, sparsity, and locality. In video analysis, efficient learning of high dimensional visual data often requires modeling the temporal evolution of object appearance and motion. Instead of analyzing all the visual data, object level temporal information can be extracted via visual tracking for more efficient learning. Over a long video sequence, the object appearance changes due to variations in pose and orientation, illumination, and occlusion. Tracking both the object appearance and position therefore requires a robust tracker with an adaptive appearance update. We propose a generative model that addresses the uncertainties in target position and appearance simultaneously. A diffusion process on a Riemannian manifold allows a geodesic evolution of the target appearance.
Spatially, object segmentation can significantly simplify visual learning by grouping many pixels into a meaningful representation. Useful as it is, object segmentation remains unsolved. We propose a novel object co-segmentation framework that learns object segmentation from a large image set. By leveraging good segmentation results on the simplest images, we can propagate them to progressively more complex images. All the proposed methods enable efficient learning of high dimensional visual data. The proposed unified framework for simultaneous dimensionality reduction and feature selection provides an abstraction for many high dimensional learning methods in the literature, and further learning methods can be developed on top of it. Both visual tracking and object segmentation are important steps for many complex visual applications such as visual recognition. The proposed methods enable robust and efficient exploitation of both temporal and spatial information in visual data.
author2 Cham Tat Jen
author_facet Cham Tat Jen
Chen, Marcus Caixing
format Theses and Dissertations
author Chen, Marcus Caixing
author_sort Chen, Marcus Caixing
title Efficient learning methods for high dimensional visual data
title_sort efficient learning methods for high dimensional visual data
publishDate 2015
url https://hdl.handle.net/10356/65624
_version_ 1759854536053227520