Few-shot visual understanding with deep neural networks

Deep Neural Networks (DNNs) have become indispensable for a variety of computer vision tasks, such as image recognition, image segmentation, and object detection. The availability of large-scale labeled datasets and the powerful fitting capability of deep models are two crucial factors that contribu...

Bibliographic Details
Main Author: Zhang, Chi
Other Authors: Lin Guosheng
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University 2022
Subjects: Engineering::Computer science and engineering
Online Access: https://hdl.handle.net/10356/154696
Institution: Nanyang Technological University
id sg-ntu-dr.10356-154696
record_format dspace
spelling sg-ntu-dr.10356-154696 2022-02-02T08:01:56Z Few-shot visual understanding with deep neural networks Zhang, Chi Lin Guosheng School of Computer Science and Engineering gslin@ntu.edu.sg Engineering::Computer science and engineering Doctor of Philosophy 2022-01-06T01:57:26Z 2022-01-06T01:57:26Z 2021 Thesis-Doctor of Philosophy Zhang, C. (2021). Few-shot visual understanding with deep neural networks. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/154696 https://hdl.handle.net/10356/154696 10.32657/10356/154696 en This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). application/pdf Nanyang Technological University
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Engineering::Computer science and engineering
spellingShingle Engineering::Computer science and engineering
Zhang, Chi
Few-shot visual understanding with deep neural networks
description Deep Neural Networks (DNNs) have become indispensable for a variety of computer vision tasks, such as image recognition, image segmentation, and object detection. The availability of large-scale labeled datasets and the powerful fitting capability of deep models are two crucial factors that contribute to their success. However, models trained under fully supervised learning have many limitations that hinder their application in real-world scenarios. For example, a trained CNN model applies only to a set of pre-defined classes, and fine-tuning it for tasks on new categories requires a large amount of labeled data. Moreover, data labelling can be very expensive for some vision tasks, such as image segmentation. Few-shot learning has been proposed as a promising direction to alleviate the need for exhaustively labeled data by exploring a learning setting in which only a few labeled samples are available to undertake a novel task, based on prior knowledge learned from previous tasks. Typical application scenarios are few-shot image classification and few-shot image segmentation. In this thesis, we propose novel algorithms to address few-shot learning problems in image recognition and image segmentation. Specifically, we treat few-shot image segmentation as a message-passing problem that aims to extract useful information from the limited training data for predictions on test images. We present two frameworks to achieve this goal. One employs the idea of prototype learning by representing the message in a category as a class-specific prototype, and the other employs graph models to conduct message passing between local regions. For few-shot classification, we propose an algorithm that uses local representations in the images and a structured distance to determine image similarity for classification. Finally, because different few-shot learning algorithms often excel in different few-shot learning scenarios, we propose to automate the selection among various few-shot learning designs and present a search-based framework inspired by the recent success of the Automated Machine Learning (AutoML) literature.
author2 Lin Guosheng
author_facet Lin Guosheng
Zhang, Chi
format Thesis-Doctor of Philosophy
author Zhang, Chi
author_sort Zhang, Chi
title Few-shot visual understanding with deep neural networks
title_sort few-shot visual understanding with deep neural networks
publisher Nanyang Technological University
publishDate 2022
url https://hdl.handle.net/10356/154696
_version_ 1724626838148349952
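
Illustrative note: the description above mentions representing the information in a category as a class-specific prototype for few-shot segmentation. The record gives no implementation details, so the following is only a minimal sketch of one common way such a prototype can be formed and used, assuming masked average pooling over support features and a cosine-similarity threshold on query features. The function names, the toy feature values, and the 0.5 threshold are all hypothetical and are not taken from the thesis.

```python
import numpy as np


def masked_average_prototype(feat, mask):
    """Average the support features over foreground pixels.

    feat: (H, W, C) support feature map; mask: (H, W) binary foreground mask.
    Returns a (C,) prototype vector. (Hypothetical helper, not from the thesis.)
    """
    weights = mask[..., None].astype(feat.dtype)
    return (feat * weights).sum(axis=(0, 1)) / max(weights.sum(), 1e-8)


def cosine_similarity_map(feat, prototype):
    """Cosine similarity between every query pixel feature and the prototype."""
    feat_norm = np.linalg.norm(feat, axis=-1) + 1e-8
    proto_norm = np.linalg.norm(prototype) + 1e-8
    return (feat @ prototype) / (feat_norm * proto_norm)


# Toy 2x2 "feature maps" with 2 channels (hand-made values for illustration).
support_feat = np.array([[[1.0, 0.0], [0.0, 1.0]],
                         [[1.0, 0.0], [0.0, 1.0]]])
support_mask = np.array([[1, 0],
                         [1, 0]])
query_feat = np.array([[[0.9, 0.1], [0.1, 0.9]],
                       [[0.2, 0.8], [0.8, 0.2]]])

prototype = masked_average_prototype(support_feat, support_mask)
similarity = cosine_similarity_map(query_feat, prototype)
pred_mask = similarity > 0.5  # assumed threshold; a real system would learn this
```

In this toy episode the two foreground support pixels share the same feature, so the prototype equals that feature, and the query pixels whose features point in a similar direction are marked as foreground.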