Few-shot vision recognition and generation for the open-world
Deep Neural Networks (DNNs) have achieved remarkable success across various computer vision tasks, but their reliance on extensive labeled datasets limits their applicability in data-scarce scenarios. Few-shot learning offers a promising solution by enabling models to learn from minimal data, yet tr...
Main Author: | Song, Nan |
---|---|
Other Authors: | Lin Guosheng |
Format: | Thesis-Doctor of Philosophy |
Language: | English |
Published: | Nanyang Technological University, 2024 |
Subjects: | Computer and Information Science; Few-shot learning |
Online Access: | https://hdl.handle.net/10356/181293 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-181293 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-181293; 2024-12-03T05:20:50Z; Few-shot vision recognition and generation for the open-world; Song, Nan; Lin Guosheng; College of Computing and Data Science; gslin@ntu.edu.sg; Computer and Information Science; Few-shot learning; Doctor of Philosophy; 2024-11-24T23:54:37Z; 2024-11-24T23:54:37Z; 2024; Thesis-Doctor of Philosophy; Song, N. (2024). Few-shot vision recognition and generation for the open-world. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/181293; https://hdl.handle.net/10356/181293; 10.32657/10356/181293; en; This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0); application/pdf; Nanyang Technological University |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Computer and Information Science; Few-shot learning |
spellingShingle |
Computer and Information Science Few-shot learning Song, Nan Few-shot vision recognition and generation for the open-world |
description |
Deep Neural Networks (DNNs) have achieved remarkable success across various computer vision tasks, but their reliance on extensive labeled datasets limits their applicability in data-scarce scenarios. Few-shot learning offers a promising solution by enabling models to learn from minimal data, yet traditional approaches assume a closed set of classes, which is impractical in open-world settings. This thesis addresses the challenges of few-shot learning in an open-world context by introducing three novel frameworks: Few-shot Open-set Recognition (FSOSR), Few-shot Class Incremental Learning (FSCIL), and Lifelong Few-shot Text-to-Image Diffusion. For FSOSR, we reserve space for unseen classes and leverage background features from seen classes as pseudo unseen classes to effectively learn decision boundaries. For FSCIL, we adopt a decoupled learning strategy that prevents knowledge forgetting by updating only classifiers during incremental sessions and introduce a Continually Evolved Classifier (CEC) using graph-based context propagation. In Lifelong Few-shot Text-to-Image Diffusion, we integrate data-free knowledge distillation and In-Context Generation (ICGen) to continuously generate high-quality images from limited examples while retaining prior knowledge. Extensive experiments on benchmark datasets demonstrate that these frameworks significantly improve adaptability and efficiency in dynamic environments, setting new state-of-the-art results. This thesis advances both theoretical and practical aspects of few-shot learning, enabling robust and scalable AI systems for real-world applications. |
author2 |
Lin Guosheng |
author_facet |
Lin Guosheng; Song, Nan |
format |
Thesis-Doctor of Philosophy |
author |
Song, Nan |
author_sort |
Song, Nan |
title |
Few-shot vision recognition and generation for the open-world |
title_short |
Few-shot vision recognition and generation for the open-world |
title_full |
Few-shot vision recognition and generation for the open-world |
title_fullStr |
Few-shot vision recognition and generation for the open-world |
title_full_unstemmed |
Few-shot vision recognition and generation for the open-world |
title_sort |
few-shot vision recognition and generation for the open-world |
publisher |
Nanyang Technological University |
publishDate |
2024 |
url |
https://hdl.handle.net/10356/181293 |
_version_ |
1819112932356653056 |