Few-shot visual understanding with deep neural networks
Deep Neural Networks (DNNs) have become indispensable for a variety of computer vision tasks, such as image recognition, image segmentation, and object detection. The availability of large-scale labeled datasets and the powerful fitting capability of deep models are two crucial factors that contribu...
Main Author: | Zhang, Chi |
---|---|
Other Authors: | Lin Guosheng |
Format: | Thesis-Doctor of Philosophy |
Language: | English |
Published: | Nanyang Technological University 2022 |
Subjects: | Engineering::Computer science and engineering |
Online Access: | https://hdl.handle.net/10356/154696 |
Institution: | Nanyang Technological University |
id | sg-ntu-dr.10356-154696 |
---|---|
record_format | dspace |
spelling | sg-ntu-dr.10356-154696 2022-02-02T08:01:56Z Few-shot visual understanding with deep neural networks Zhang, Chi Lin Guosheng School of Computer Science and Engineering gslin@ntu.edu.sg Engineering::Computer science and engineering Deep Neural Networks (DNNs) have become indispensable for a variety of computer vision tasks, such as image recognition, image segmentation, and object detection. The availability of large-scale labeled datasets and the powerful fitting capability of deep models are two crucial factors that contribute to their success. However, models trained under fully supervised learning have many limitations that hinder their application in real-world scenarios. For example, a trained CNN model applies only to a set of pre-defined classes, and fine-tuning it for tasks on new categories requires a large amount of labeled data. Moreover, data labelling can be very expensive for some vision tasks, such as image segmentation. Few-shot learning is proposed as a promising direction to alleviate the need for exhaustively labeled data by exploring a learning case where only a few labeled examples are available to undertake a novel task based on prior knowledge learned on previous tasks. Typical application scenarios are few-shot image classification and few-shot image segmentation. In this thesis, we propose novel algorithms to address few-shot learning problems in image recognition and image segmentation tasks. Specifically, we formulate few-shot image segmentation as a message-passing problem that aims to extract useful information from the limited training data for the predictions on test images. We present two frameworks to achieve this goal. One employs the idea of prototype learning by representing the message in a category as a class-specific prototype, and the other employs graph models to conduct message passing between local regions.
For few-shot classification, we propose an algorithm that utilizes local representations in the images and a structured distance to determine the image similarity for classification. To further address the limitation that different few-shot learning algorithms often excel in different few-shot learning scenarios, we propose to automate the selection among various few-shot learning designs and present a search-based framework, inspired by the recent success of Automated Machine Learning (AutoML). Doctor of Philosophy 2022-01-06T01:57:26Z 2022-01-06T01:57:26Z 2021 Thesis-Doctor of Philosophy Zhang, C. (2021). Few-shot visual understanding with deep neural networks. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/154696 https://hdl.handle.net/10356/154696 10.32657/10356/154696 en This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). application/pdf Nanyang Technological University |
institution | Nanyang Technological University |
building | NTU Library |
continent | Asia |
country | Singapore Singapore |
content_provider | NTU Library |
collection | DR-NTU |
language | English |
topic | Engineering::Computer science and engineering |
spellingShingle | Engineering::Computer science and engineering Zhang, Chi Few-shot visual understanding with deep neural networks |
description | Deep Neural Networks (DNNs) have become indispensable for a variety of computer vision tasks, such as image recognition, image segmentation, and object detection. The availability of large-scale labeled datasets and the powerful fitting capability of deep models are two crucial factors that contribute to their success. However, models trained under fully supervised learning have many limitations that hinder their application in real-world scenarios. For example, a trained CNN model applies only to a set of pre-defined classes, and fine-tuning it for tasks on new categories requires a large amount of labeled data. Moreover, data labelling can be very expensive for some vision tasks, such as image segmentation. Few-shot learning is proposed as a promising direction to alleviate the need for exhaustively labeled data by exploring a learning case where only a few labeled examples are available to undertake a novel task based on prior knowledge learned on previous tasks. Typical application scenarios are few-shot image classification and few-shot image segmentation.
In this thesis, we propose novel algorithms to address few-shot learning problems in image recognition and image segmentation tasks. Specifically, we formulate few-shot image segmentation as a message-passing problem that aims to extract useful information from the limited training data for the predictions on test images. We present two frameworks to achieve this goal. One employs the idea of prototype learning by representing the message in a category as a class-specific prototype, and the other employs graph models to conduct message passing between local regions. For few-shot classification, we propose an algorithm that utilizes local representations in the images and a structured distance to determine the image similarity for classification. To further address the limitation that different few-shot learning algorithms often excel in different few-shot learning scenarios, we propose to automate the selection among various few-shot learning designs and present a search-based framework, inspired by the recent success of Automated Machine Learning (AutoML). |
author2 | Lin Guosheng |
author_facet | Lin Guosheng Zhang, Chi |
format | Thesis-Doctor of Philosophy |
author | Zhang, Chi |
author_sort | Zhang, Chi |
title | Few-shot visual understanding with deep neural networks |
title_short | Few-shot visual understanding with deep neural networks |
title_full | Few-shot visual understanding with deep neural networks |
title_fullStr | Few-shot visual understanding with deep neural networks |
title_full_unstemmed | Few-shot visual understanding with deep neural networks |
title_sort | few-shot visual understanding with deep neural networks |
publisher | Nanyang Technological University |
publishDate | 2022 |
url | https://hdl.handle.net/10356/154696 |
_version_ | 1724626838148349952 |