Image classification with limited data information

Image classification is a fundamental problem in image processing and computer vision. Recent algorithms have achieved significantly better results by learning deep features from large-scale datasets, such as ImageNet. However, in practice, challenges persist, especially with (I) low-quality image d...

Bibliographic Details
Main Author:	Cheng, Hao
Other Authors:	Wen Bihan
Format:	Thesis-Doctor of Philosophy
Language:	English
Published:	Nanyang Technological University 2024
Subjects:	Engineering Image classification Few-shot learning
Online Access:	https://hdl.handle.net/10356/174167
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-174167
record_format	dspace
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	Engineering Image classification Few-shot learning
spellingShingle	Engineering Image classification Few-shot learning Cheng, Hao Image classification with limited data information
description	Image classification is a fundamental problem in image processing and computer vision. Recent algorithms have achieved significantly better results by learning deep features from large-scale datasets, such as ImageNet. However, in practice, challenges persist, especially with (I) low-quality image data, such as noisy data or image data with variations in object appearance, as encountered in image-set classification, and (II) limited availability of image data, including scarce samples, for example, in weakly supervised classification, or restricted availability of labeled data, as seen in few-shot image classification. These tasks require models that are generic and highly flexible, yet able to avoid over-fitting and to generalize well when only a few samples are available. This thesis presents three works to tackle image classification tasks with limited data information in weakly supervised and few-shot learning. We begin our study with small-scale visual classification tasks. From a traditional model-based perspective, we introduce a novel method called Joint Statistical and Spatial Sparse (J3S) representation, which reconciles local spatial patch structures and global statistical Gaussian distribution with joint sparsity. Integrating global Gaussian statistical and local spatial patch information through J3S with two joint dictionaries yields more accurate and robust results than using either type of information alone. Moving beyond general small-scale classification tasks, we then extend our exploration to a specific task: few-shot image classification. Here, we propose two deep learning-based methods to tackle challenges arising from limited data information. First, we focus on Graph Neural Networks (GNN) and investigate the limitations of existing GNN methods for few-shot learning. To address over-fitting and over-smoothing issues observed in recent GNN approaches, we propose the Attentive GNN (AGNN) framework. AGNN incorporates a triple-attention mechanism, facilitating graph initialization, graph update, and correlation across graph layers. We provide both theoretical analysis and practical illustrations to show how the proposed modules enhance GNN scalability for few-shot tasks, thereby improving few-shot performance. Subsequently, we explore more generalized and challenging few-shot scenarios, encompassing few-shot domain generalization settings. To address feature distraction caused by class-irrelevant features such as style, domain, and background in image data, we propose a novel Disentangled Feature Representation framework (DFR). DFR removes information irrelevant to classification, thus enhancing performance through class-domain disentanglement. Furthermore, we reorganize DomainNet into a new dataset called FS-DomainNet, specifically for benchmarking few-shot domain generalization tasks. The main contributions of this thesis are threefold. Firstly, we conduct a comprehensive study on image classification with limited data from both model-based and deep learning-based perspectives. Secondly, we propose three novel approaches that address various challenges caused by limited data information from different angles. We also introduce the FS-DomainNet dataset, specifically designed for evaluating the performance of few-shot methods in more generalized and challenging real-life scenarios. Lastly, we validate the effectiveness of our proposed methods through extensive experiments on multiple benchmarks. Both qualitative and quantitative results demonstrate the improved performance of the proposed approaches for image classification with limited data. The contributions made in this study significantly advance the understanding and practical capabilities of image classification with limited data information and provide essential groundwork for future research in this domain.
author2	Wen Bihan
author_facet	Wen Bihan Cheng, Hao
format	Thesis-Doctor of Philosophy
author	Cheng, Hao
author_sort	Cheng, Hao
title	Image classification with limited data information
title_short	Image classification with limited data information
title_full	Image classification with limited data information
title_fullStr	Image classification with limited data information
title_full_unstemmed	Image classification with limited data information
title_sort	image classification with limited data information
publisher	Nanyang Technological University
publishDate	2024
url	https://hdl.handle.net/10356/174167
_version_	1800916393753837568
spelling	sg-ntu-dr.10356-174167 2024-04-09T03:58:58Z Image classification with limited data information Cheng, Hao Wen Bihan School of Electrical and Electronic Engineering bihan.wen@ntu.edu.sg Engineering Image classification Few-shot learning Doctor of Philosophy 2024-03-18T10:45:01Z 2024-03-18T10:45:01Z 2023 Thesis-Doctor of Philosophy Cheng, H. (2023). Image classification with limited data information. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/174167 10.32657/10356/174167 en This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). application/pdf Nanyang Technological University
