Few-shot vision recognition and generation for the open-world

Deep Neural Networks (DNNs) have achieved remarkable success across various computer vision tasks, but their reliance on extensive labeled datasets limits their applicability in data-scarce scenarios. Few-shot learning offers a promising solution by enabling models to learn from minimal data, yet tr...

Bibliographic Details
Main Author: Song, Nan
Other Authors: Lin Guosheng
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University 2024
Subjects: Computer and Information Science; Few-shot learning
Online Access: https://hdl.handle.net/10356/181293
Institution: Nanyang Technological University
id sg-ntu-dr.10356-181293
record_format dspace
spelling sg-ntu-dr.10356-181293
2024-12-03T05:20:50Z
Few-shot vision recognition and generation for the open-world
Song, Nan
Lin Guosheng
College of Computing and Data Science
gslin@ntu.edu.sg
Computer and Information Science
Few-shot learning
Doctor of Philosophy
2024-11-24T23:54:37Z
2024-11-24T23:54:37Z
2024
Thesis-Doctor of Philosophy
Song, N. (2024). Few-shot vision recognition and generation for the open-world. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/181293
https://hdl.handle.net/10356/181293
10.32657/10356/181293
en
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).
application/pdf
Nanyang Technological University
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Computer and Information Science
Few-shot learning
description Deep Neural Networks (DNNs) have achieved remarkable success across various computer vision tasks, but their reliance on extensive labeled datasets limits their applicability in data-scarce scenarios. Few-shot learning offers a promising solution by enabling models to learn from minimal data, yet traditional approaches assume a closed set of classes, which is impractical in open-world settings. This thesis addresses the challenges of few-shot learning in an open-world context by introducing three novel frameworks: Few-shot Open-set Recognition (FSOSR), Few-shot Class Incremental Learning (FSCIL), and Lifelong Few-shot Text-to-Image Diffusion. For FSOSR, we reserve space for unseen classes and leverage background features from seen classes as pseudo unseen classes to effectively learn decision boundaries. For FSCIL, we adopt a decoupled learning strategy that prevents knowledge forgetting by updating only classifiers during incremental sessions and introduce a Continually Evolved Classifier (CEC) using graph-based context propagation. In Lifelong Few-shot Text-to-Image Diffusion, we integrate data-free knowledge distillation and In-Context Generation (ICGen) to continuously generate high-quality images from limited examples while retaining prior knowledge. Extensive experiments on benchmark datasets demonstrate that these frameworks significantly improve adaptability and efficiency in dynamic environments, setting new state-of-the-art results. This thesis advances both theoretical and practical aspects of few-shot learning, enabling robust and scalable AI systems for real-world applications.
author2 Lin Guosheng
author_facet Lin Guosheng
Song, Nan
format Thesis-Doctor of Philosophy
author Song, Nan
author_sort Song, Nan
title Few-shot vision recognition and generation for the open-world
publisher Nanyang Technological University
publishDate 2024
url https://hdl.handle.net/10356/181293
_version_ 1819112932356653056