Image classification with limited data information

Bibliographic Details
Main Author: Cheng, Hao
Other Authors: Wen Bihan
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University 2024
Online Access: https://hdl.handle.net/10356/174167
Institution: Nanyang Technological University
Description
Summary: Image classification is a fundamental problem in image processing and computer vision. Recent algorithms have achieved significantly better results by learning deep features from large-scale datasets such as ImageNet. In practice, however, challenges persist, especially with (I) low-quality image data, such as noisy data or image data with variations in object appearance, as encountered in image-set classification, and (II) limited availability of image data, including scarce samples, as in weakly supervised classification, or restricted availability of labeled data, as in few-shot image classification. These tasks require models that are generic and highly flexible, yet able to avoid over-fitting and poor generalization when only a few samples are available. This thesis presents three works that tackle image classification with limited data information in weakly supervised and few-shot learning.

We begin our study with small-scale visual classification tasks. From a traditional model-based perspective, we introduce a novel method called Joint Statistical and Spatial Sparse (J3S) representation, which reconciles local spatial patch structures and a global statistical Gaussian distribution under joint sparsity. Integrating global Gaussian statistical information and local spatial patch information through J3S with two joint dictionaries yields more accurate and robust results than considering either type of information alone.

Moving beyond general small-scale classification tasks, we then turn to a specific task, few-shot image classification, and propose two deep learning-based methods to tackle the challenges arising from limited data information. We first focus on Graph Neural Networks (GNNs) and investigate the limitations of existing GNN methods for few-shot learning. To address the over-fitting and over-smoothing issues observed in recent GNN approaches, we propose the Attentive GNN (AGNN) framework. AGNN incorporates a triple-attention mechanism that facilitates graph initialization, graph update, and correlation across graph layers. We provide both theoretical analysis and practical illustrations to show how the proposed modules enhance GNN scalability for few-shot tasks, thereby improving few-shot performance. We then explore more general and challenging few-shot scenarios, including few-shot domain generalization settings. To address feature distraction caused by class-irrelevant excursive features such as style, domain, and background in image data, we propose a novel Disentangled Feature Representation (DFR) framework. DFR removes information that is irrelevant to classification through class-domain disentanglement, thereby improving performance. Furthermore, we reorganize DomainNet into a new dataset, FS-DomainNet, specifically for benchmarking few-shot domain generalization tasks.

The main contributions of this thesis are threefold. First, we conduct a comprehensive study of image classification with limited data from both model-based and deep learning-based perspectives. Second, we propose three novel approaches that address, from different angles, various challenges caused by limited data information; we also introduce the FS-DomainNet dataset, designed for evaluating few-shot methods in more general and challenging real-life scenarios. Third, we validate the effectiveness of the proposed methods through extensive experiments on multiple benchmarks. Both qualitative and quantitative results demonstrate the improved performance of the proposed approaches for image classification with limited data. The contributions of this study significantly advance the understanding and practical capabilities of image classification with limited data information and provide essential groundwork for future research in this domain.