Distance learning between image and class for object recognition

Object recognition is an active research topic in the computer vision community. Recently a novel Image-to-Class (I2C) distance has been proposed to handle this problem, which classifies images using a simple Naive-Bayes based nearest-neighbor (NBNN) classifier but provides surprisingly excellent pe...

Bibliographic Details
Main Author:	Wang, Zhengxiang
Other Authors:	Chia Liang Tien, Clement
Format:	Theses and Dissertations
Language:	English
Published:	2013
Subjects:	DRNTU::Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision
Online Access:	https://hdl.handle.net/10356/54819
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-54819
record_format	dspace
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	DRNTU::Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision
description	Object recognition is an active research topic in the computer vision community. Recently, a novel Image-to-Class (I2C) distance has been proposed for this problem; it classifies images using a simple Naive-Bayes based nearest-neighbor (NBNN) classifier yet achieves surprisingly good performance. This distance offers a new direction that avoids feature quantization and shows better generalization capability than the traditional Image-to-Image (I2I) distance. However, computing this distance is expensive, since its performance relies heavily on searching for the nearest neighbor (NN) among a large number of training features. In addition, the label information of the training data is not fully used, which limits recognition performance. In this thesis, we aim to improve both the recognition performance and the efficiency of the I2C distance, and to extend its range of applications. First, we add a training phase that learns a weighted I2C distance to improve recognition performance. We propose a large-margin optimization framework to learn the I2C distance function, which is modeled as a weighted combination of the distances from every local feature in an image to its NN in a candidate class. The weights, which are associated with local features in the training set, are learned under the constraint that the I2C distance from an image to the class it belongs to should be less than its distance to any other class. To reduce the computation cost, we also propose two methods, based on spatial division and hubness score, that accelerate the NN search; they substantially reduce the on-line testing time while preserving or even improving classification accuracy. Second, we propose a distance metric learning method that further improves the performance of the I2C distance by learning Per-Class Mahalanobis metrics. 
The resulting Mahalanobis I2C distance adapts to different classes by using the metric learned for each class. The Per-Class metrics are learned simultaneously by formulating a convex optimization problem, which is solved with an efficient subgradient descent method. For efficiency and scalability to large-scale problems, we also show how to simplify the method to learn a diagonal matrix for each class. Third, we extend object recognition to the multi-label problem and propose a Class-to-Image (C2I) distance, which shows better performance than the I2C distance for multi-label image classification. However, since a class contains far more local features than an image, the C2I distance is more expensive to compute than the I2C distance. Moreover, the label information of training images can be used to help select relevant local features for each class and further improve the recognition performance. Therefore, to make the C2I distance both faster and more accurate, we propose an optimization algorithm that uses L_1-norm regularization and a large-margin constraint to learn the C2I distance. This not only reduces the number of local features in the class feature set but also improves the performance of the C2I distance by exploiting label information. We also use this C2I distance for object localization, so that it can tell not only whether a candidate class appears in a test image, but also where it is located. With these three contributions, we improve the recognition performance and efficiency of the I2C distance and make it applicable to the multi-label problem. The learned distance between image and class should therefore be more practical for real-world object recognition applications.
author2	Chia Liang Tien, Clement
author_facet	Chia Liang Tien, Clement Wang, Zhengxiang
format	Theses and Dissertations
author	Wang, Zhengxiang
author_sort	Wang, Zhengxiang
title	Distance learning between image and class for object recognition
publishDate	2013
url	https://hdl.handle.net/10356/54819
_version_	1759856710050119680
spelling	sg-ntu-dr.10356-54819 2023-03-04T00:37:15Z School of Computer Engineering Centre for Multimedia and Network Technology DOCTOR OF PHILOSOPHY (SCE) 2013-08-30T01:59:50Z 2013-08-30T01:59:50Z 2013 2013 Thesis Wang, Z. (2013). Distance learning between image and class for object recognition. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/54819 10.32657/10356/54819 en 133 p. application/pdf
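
The I2C distance described in the abstract (the sum, over an image's local features, of the distance to the nearest neighbor in a candidate class, optionally weighted) can be illustrated with a minimal sketch. This is not the thesis implementation: the function names `i2c_distance` and `classify_nbnn`, the use of squared Euclidean distance, the one-weight-per-training-feature layout, and the brute-force NN search (in place of the accelerated search the thesis proposes) are all assumptions made for illustration. The weights here are simply supplied by the caller; the large-margin learning step is not shown.

```python
import numpy as np


def i2c_distance(image_feats, class_feats, class_weights=None):
    """Illustrative Image-to-Class distance (hypothetical helper).

    image_feats:   (n_image, d) local features of one image.
    class_feats:   (n_class, d) local features pooled from a class's training images.
    class_weights: optional (n_class,) weights, one per training feature
                   (assumed layout); None gives the unweighted distance.
    """
    # Pairwise squared Euclidean distances, shape (n_image, n_class).
    diffs = image_feats[:, None, :] - class_feats[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=2)

    # Brute-force nearest neighbor in the class for every image feature.
    nn_idx = np.argmin(sq_dists, axis=1)
    nn_sq = sq_dists[np.arange(image_feats.shape[0]), nn_idx]

    if class_weights is None:
        return float(nn_sq.sum())
    # Weighted combination: each NN distance is scaled by the weight
    # attached to the matched training feature.
    return float(np.sum(class_weights[nn_idx] * nn_sq))


def classify_nbnn(image_feats, class_feature_sets, class_weight_sets=None):
    """Assign the class with the smallest I2C distance (NBNN-style decision)."""
    dists = {}
    for label, feats in class_feature_sets.items():
        weights = None if class_weight_sets is None else class_weight_sets[label]
        dists[label] = i2c_distance(image_feats, feats, weights)
    best = min(dists, key=dists.get)
    return best, dists
```

With `class_weight_sets=None` the sketch reduces to the plain NBNN decision rule; passing per-class weight arrays gives the weighted variant, in which a weight of zero effectively ignores matches to that training feature.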
