Image classification with limited data information
Image classification is a fundamental problem in image processing and computer vision. Recent algorithms have achieved significantly better results by learning deep features from large-scale datasets, such as ImageNet. However, in practice, challenges persist, especially with (I) low-quality image d...
Main Author: | Cheng, Hao |
---|---|
Other Authors: | Wen Bihan |
Format: | Thesis-Doctor of Philosophy |
Language: | English |
Published: | Nanyang Technological University, 2024 |
Subjects: | Engineering; Image classification; Few-shot learning |
Online Access: | https://hdl.handle.net/10356/174167 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-174167 |
---|---|
record_format |
dspace |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Engineering Image classification Few-shot learning |
spellingShingle |
Engineering Image classification Few-shot learning Cheng, Hao Image classification with limited data information |
description |
Image classification is a fundamental problem in image processing and computer vision. Recent algorithms have achieved significantly better results by learning deep features from large-scale datasets such as ImageNet. In practice, however, challenges persist, especially with (I) low-quality image data, such as noisy data or image data with variations in object appearance, as encountered in image-set classification, and (II) limited availability of image data, including scarce samples, as in weakly supervised classification, or scarce labeled data, as in few-shot image classification. These tasks require models that are generic and highly flexible, yet able to avoid over-fitting and to generalize when only a few samples are available. This thesis presents three works that tackle image classification with limited data information in weakly supervised and few-shot learning.
We begin our study with small-scale visual classification tasks. From a traditional model-based perspective, we introduce a novel method called Joint Statistical and Spatial Sparse (J3S) representation, which reconciles local spatial patch structures and global statistical Gaussian distributions through joint sparsity. Integrating global Gaussian statistical information and local spatial patch information through J3S with two joint dictionaries yields more accurate and robust results than considering either type of information alone. Moving beyond general small-scale classification tasks, we then extend our exploration to a specific task, few-shot image classification. Here, we propose two deep learning-based methods to tackle challenges arising from limited data information. First, we focus on Graph Neural Networks (GNNs) and investigate the limitations of existing GNN methods for few-shot learning. To address the over-fitting and over-smoothing issues observed in recent GNN approaches, we propose the Attentive GNN (AGNN) framework. AGNN incorporates a triple-attention mechanism that covers graph initialization, graph update, and correlation across graph layers. We provide both theoretical analysis and practical illustrations to show how the proposed modules enhance GNN scalability for few-shot tasks, thereby improving few-shot performance. Subsequently, we explore more general and challenging few-shot scenarios, including few-shot domain generalization settings. To address feature distraction caused by class-irrelevant excursive features such as style, domain, and background in image data, we propose a novel Disentangled Feature Representation (DFR) framework. DFR removes information that is irrelevant to classification through class-domain disentanglement, thus enhancing performance. Furthermore, we reorganize DomainNet into a new dataset, FS-DomainNet, specifically for benchmarking few-shot domain generalization tasks.
The main contributions of this thesis are threefold. First, we conduct a comprehensive study of image classification with limited data from both model-based and deep learning-based perspectives. Second, we propose three novel approaches that address, from different angles, various challenges caused by limited data information; we also introduce the FS-DomainNet dataset, specifically designed for evaluating few-shot methods in more general and challenging real-life scenarios. Third, we validate the effectiveness of the proposed methods through extensive experiments on multiple benchmarks. Both qualitative and quantitative results demonstrate the improved performance of the proposed approaches for image classification with limited data. The contributions of this study significantly advance the understanding and practical capabilities of image classification with limited data information and provide essential groundwork for future research in this domain. |
author2 |
Wen Bihan |
author_facet |
Wen Bihan Cheng, Hao |
format |
Thesis-Doctor of Philosophy |
author |
Cheng, Hao |
author_sort |
Cheng, Hao |
title |
Image classification with limited data information |
title_short |
Image classification with limited data information |
title_full |
Image classification with limited data information |
title_fullStr |
Image classification with limited data information |
title_full_unstemmed |
Image classification with limited data information |
title_sort |
image classification with limited data information |
publisher |
Nanyang Technological University |
publishDate |
2024 |
url |
https://hdl.handle.net/10356/174167 |
_version_ |
1800916393753837568 |
spelling |
sg-ntu-dr.10356-174167 2024-04-09T03:58:58Z Image classification with limited data information Cheng, Hao Wen Bihan School of Electrical and Electronic Engineering bihan.wen@ntu.edu.sg Engineering Image classification Few-shot learning Doctor of Philosophy 2024-03-18T10:45:01Z 2024-03-18T10:45:01Z 2023 Thesis-Doctor of Philosophy Cheng, H. (2023). Image classification with limited data information. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/174167 https://hdl.handle.net/10356/174167 10.32657/10356/174167 en This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). application/pdf Nanyang Technological University |