Content-based image retrieval with statistical machine learning.

Content-based image retrieval (CBIR) has attracted intensive attention in the computer vision community during the last decades. Relevance feedback (RF) is a powerful tool to bridge the gap between low-level visual features and high-level semantic concepts in CBIR. Although many algorithms have obta...

Bibliographic Details
Main Author: Zhang, Lining.
Other Authors: Wang Lipo
Format: Theses and Dissertations
Language: English
Published: 2013
Subjects: DRNTU::Engineering::Electrical and electronic engineering
Online Access: https://hdl.handle.net/10356/54889
Institution: Nanyang Technological University
id sg-ntu-dr.10356-54889
record_format dspace
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering::Electrical and electronic engineering
description Content-based image retrieval (CBIR) has attracted intensive attention in the computer vision community over the last few decades. Relevance feedback (RF) is a powerful tool for bridging the gap between low-level visual features and high-level semantic concepts in CBIR. Although many algorithms have achieved promising performance in various practical applications, CBIR remains an open research topic, mainly because of the difficulty of bridging this semantic gap. This thesis focuses on applying statistical machine learning techniques to maximize the potential of conventional RF methods and thereby significantly improve the performance of CBIR. To alleviate the small-sized training data problem in conventional discriminant-analysis-based RF, i.e., biased discriminant analysis (BDA), a generalized BDA (GBDA) method is developed based on the differential scatter discriminant criterion (DSDC). By redesigning the between-class scatter matrix and integrating the locality preserving principle, GBDA also avoids both the Gaussian distribution assumption for the positive feedback samples and the overfitting problem in BDA. A large number of empirical studies show that the new method significantly outperforms BDA and its extensions. To incorporate the asymmetric property of the training data into conventional classification-based RF, i.e., support vector machine (SVM)-based RF, a biased maximum margin analysis (BMMA) method is designed based on the graph embedding framework to separate the positive and negative feedback samples by a maximum margin in the reduced subspace. By introducing a Laplacian regularizer into BMMA, semi-supervised BMMA (SemiBMMA) is also proposed to utilize the information in unlabeled samples for SVM-based RF. Experiments on a real-world image database demonstrate that the proposed scheme, combined with SVM-based RF, better models the RF procedure and reduces the performance degradation caused by the asymmetric property of the training data.
To select the most informative samples for the user to label, a geometric optimum experimental design (GOED) method is proposed, which selects multiple representative samples in the database as the most informative ones. GOED alleviates the small-sized training data problem by leveraging the geometric structure of unlabeled samples in the reproducing kernel Hilbert space (RKHS), and thus further enhances image retrieval performance. Because it minimizes the expected average prediction variance on the test data, GOED has a clear geometric interpretation: it iteratively selects a set of the most representative samples in the database, with a globally optimal solution. Moreover, the new method is label-independent and effectively avoids various potential problems caused by insufficient and inexactly labeled samples in RF. Extensive experiments on both synthetic datasets and a real-world image database confirm the advantages of GOED. To exploit the RF log data, conjunctive patches subspace learning (CPSL) with side information is developed. CPSL directly learns a semantic concept subspace from the RF log data using a set of similar and dissimilar pairwise constraints, without any explicit class label information, which is more practical and useful in many real-world applications. CPSL can be formulated as a constrained optimization problem, and an efficient algorithm with closed-form solutions is presented to solve it. Moreover, the new method can also learn a distance metric, yet performs more effectively and efficiently on high-dimensional data. A large number of empirical studies demonstrate the effectiveness of CPSL in exploiting the RF log data to improve the performance of CBIR.
author2 Wang Lipo
author_facet Wang Lipo
Zhang, Lining.
format Theses and Dissertations
author Zhang, Lining.
author_sort Zhang, Lining.
title Content-based image retrieval with statistical machine learning.
publishDate 2013
url https://hdl.handle.net/10356/54889
_version_ 1772829133199376384
spelling sg-ntu-dr.10356-54889 2023-07-04T16:19:03Z Content-based image retrieval with statistical machine learning. Zhang, Lining. Wang Lipo School of Electrical and Electronic Engineering DRNTU::Engineering::Electrical and electronic engineering
DOCTOR OF PHILOSOPHY (EEE) 2013-10-22T08:06:19Z 2013-10-22T08:06:19Z 2013 2013 Thesis Zhang, L. (2013). Content-based image retrieval with statistical machine learning. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/54889 10.32657/10356/54889 en 204 p. application/pdf