Few-shot vision recognition and generation for the open-world

Deep Neural Networks (DNNs) have achieved remarkable success across various computer vision tasks, but their reliance on extensive labeled datasets limits their applicability in data-scarce scenarios. Few-shot learning offers a promising solution by enabling models to learn from minimal data, yet tr...

Bibliographic Details
Main Author:	Song, Nan
Other Authors:	Lin Guosheng
Format:	Thesis-Doctor of Philosophy
Language:	English
Published:	Nanyang Technological University 2024
Subjects:	Computer and Information Science; Few-shot learning
Online Access:	https://hdl.handle.net/10356/181293
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-181293
record_format	dspace
spelling	sg-ntu-dr.10356-181293 2024-12-03T05:20:50Z; Few-shot vision recognition and generation for the open-world; Song, Nan; Lin Guosheng; College of Computing and Data Science; gslin@ntu.edu.sg; Computer and Information Science; Few-shot learning; Doctor of Philosophy; 2024-11-24T23:54:37Z; 2024; Thesis-Doctor of Philosophy; Song, N. (2024). Few-shot vision recognition and generation for the open-world. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/181293; 10.32657/10356/181293; en; This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).; application/pdf; Nanyang Technological University
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	Computer and Information Science; Few-shot learning
spellingShingle	Computer and Information Science; Few-shot learning; Song, Nan; Few-shot vision recognition and generation for the open-world
description	Deep Neural Networks (DNNs) have achieved remarkable success across various computer vision tasks, but their reliance on extensive labeled datasets limits their applicability in data-scarce scenarios. Few-shot learning offers a promising solution by enabling models to learn from minimal data, yet traditional approaches assume a closed set of classes, which is impractical in open-world settings. This thesis addresses the challenges of few-shot learning in an open-world context by introducing three novel frameworks: Few-shot Open-set Recognition (FSOSR), Few-shot Class Incremental Learning (FSCIL), and Lifelong Few-shot Text-to-Image Diffusion. For FSOSR, we reserve space for unseen classes and leverage background features from seen classes as pseudo unseen classes to effectively learn decision boundaries. For FSCIL, we adopt a decoupled learning strategy that prevents knowledge forgetting by updating only classifiers during incremental sessions and introduce a Continually Evolved Classifier (CEC) using graph-based context propagation. In Lifelong Few-shot Text-to-Image Diffusion, we integrate data-free knowledge distillation and In-Context Generation (ICGen) to continuously generate high-quality images from limited examples while retaining prior knowledge. Extensive experiments on benchmark datasets demonstrate that these frameworks significantly improve adaptability and efficiency in dynamic environments, setting new state-of-the-art results. This thesis advances both theoretical and practical aspects of few-shot learning, enabling robust and scalable AI systems for real-world applications.
author2	Lin Guosheng
author_facet	Lin Guosheng Song, Nan
format	Thesis-Doctor of Philosophy
author	Song, Nan
author_sort	Song, Nan
title	Few-shot vision recognition and generation for the open-world
title_sort	few-shot vision recognition and generation for the open-world
publisher	Nanyang Technological University
publishDate	2024
url	https://hdl.handle.net/10356/181293
_version_	1819112932356653056
