Few-shot visual understanding with deep neural networks

Deep Neural Networks (DNNs) have become indispensable for a variety of computer vision tasks, such as image recognition, image segmentation, and object detection. The availability of large-scale labeled datasets and the powerful fitting capability of deep models are two crucial factors that contribu...

Bibliographic Details
Main Author:	Zhang, Chi
Other Authors:	Lin Guosheng
Format:	Thesis-Doctor of Philosophy
Language:	English
Published:	Nanyang Technological University 2022
Subjects:	Engineering::Computer science and engineering
Online Access:	https://hdl.handle.net/10356/154696
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-154696
record_format	dspace
spelling	sg-ntu-dr.10356-154696 2022-02-02T08:01:56Z Few-shot visual understanding with deep neural networks Zhang, Chi Lin Guosheng School of Computer Science and Engineering gslin@ntu.edu.sg Engineering::Computer science and engineering Doctor of Philosophy 2022-01-06T01:57:26Z 2022-01-06T01:57:26Z 2021 Thesis-Doctor of Philosophy Zhang, C. (2021). Few-shot visual understanding with deep neural networks. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/154696 10.32657/10356/154696 en This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). application/pdf Nanyang Technological University
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	Engineering::Computer science and engineering
spellingShingle	Engineering::Computer science and engineering Zhang, Chi Few-shot visual understanding with deep neural networks
description	Deep Neural Networks (DNNs) have become indispensable for a variety of computer vision tasks, such as image recognition, image segmentation, and object detection. The availability of large-scale labeled datasets and the powerful fitting capability of deep models are two crucial factors that contribute to their success. However, models trained under fully supervised learning have many limitations that hinder their application in real-world scenarios. For example, a trained CNN model applies only to a set of pre-defined classes, and a large amount of labeled data is needed to fine-tune it for tasks on new categories. Moreover, data labelling can be very expensive for some vision tasks, such as image segmentation. Few-shot learning has been proposed as a promising direction to alleviate the need for exhaustively labeled data by exploring a setting where only a few labeled examples are available to undertake a novel task, based on prior knowledge learned on previous tasks. Typical application scenarios are few-shot image classification and few-shot image segmentation. In this thesis, we propose novel algorithms to address few-shot learning problems in image recognition and image segmentation. For few-shot image segmentation, we formulate the task as a message-passing problem that aims to extract useful information from the limited training data for predictions on test images. We present two frameworks to achieve this goal. One employs prototype learning, representing the message in a category as a class-specific prototype; the other employs graph models to conduct message passing between local regions. For few-shot classification, we propose an algorithm that uses local representations in the images and a structured distance to determine image similarity for classification. Finally, because different few-shot learning algorithms often excel in different few-shot learning scenarios, we propose to automate the selection among various few-shot learning designs and present a search-based framework inspired by the recent success of Automated Machine Learning (AutoML).
author2	Lin Guosheng
author_facet	Lin Guosheng Zhang, Chi
format	Thesis-Doctor of Philosophy
author	Zhang, Chi
author_sort	Zhang, Chi
title	Few-shot visual understanding with deep neural networks
title_short	Few-shot visual understanding with deep neural networks
title_full	Few-shot visual understanding with deep neural networks
title_fullStr	Few-shot visual understanding with deep neural networks
title_full_unstemmed	Few-shot visual understanding with deep neural networks
title_sort	few-shot visual understanding with deep neural networks
publisher	Nanyang Technological University
publishDate	2022
url	https://hdl.handle.net/10356/154696
_version_	1724626838148349952
