Unsupervised learning with diffusion models

In computer vision, a key goal is to obtain visual representations that faithfully capture the underlying structure and semantics of the data, encompassing object identities, positions, textures, and lighting conditions. However, existing methods for un-/self-supervised learning (SSL) are restricted to untangling basic augmentation attributes such as rotation and color modification, which constrains their capacity to efficiently modularize the underlying semantics. In this thesis, we propose DiffSiam, a novel SSL framework that incorporates a disentangled representation learning algorithm based on diffusion models. By introducing additional Gaussian noise during the diffusion forward process, DiffSiam collapses samples with similar attributes, intensifying the attribute loss. To compensate, we learn an expanding set of modular features over time, following the reconstruction objective of the diffusion model. These training dynamics bias the learned features towards disentangling diverse semantics, from fine-grained to coarse-grained attributes. Experimental results demonstrate the superior performance of DiffSiam on various classification benchmarks and generative tasks, validating its effectiveness in producing disentangled representations.

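The abstract refers to the diffusion forward process, in which progressively larger amounts of Gaussian noise are mixed into an image so that samples with similar attributes collapse together. As a point of reference only, below is a minimal sketch of the standard DDPM forward (noising) step q(x_t | x_0) = N(sqrt(alpha_bar_t) x_0, (1 - alpha_bar_t) I); the function and variable names are conventional DDPM notation, not identifiers from the thesis, and this is not the thesis's own implementation.

import numpy as np

def make_schedule(T=1000, beta_start=1e-4, beta_end=0.02):
    # Linear variance schedule; returns the cumulative products alpha_bar_t.
    betas = np.linspace(beta_start, beta_end, T)
    alphas = 1.0 - betas
    return np.cumprod(alphas)

def forward_diffuse(x0, t, alpha_bar, rng=None):
    # Sample x_t ~ q(x_t | x_0) by mixing the clean input with Gaussian noise.
    # As t grows, noise dominates and different inputs become harder to tell
    # apart, which is the attribute-collapsing effect the abstract describes.
    rng = rng or np.random.default_rng()
    noise = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise

# Example: noise a dummy 32x32 "image" lightly (t=50) and heavily (t=900).
alpha_bar = make_schedule()
x0 = np.random.default_rng(0).standard_normal((32, 32))
x_early = forward_diffuse(x0, 50, alpha_bar)
x_late = forward_diffuse(x0, 900, alpha_bar)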

Bibliographic Details
Main Author: Wang, Jiankun
Other Authors: Weichen Liu
Format: Thesis-Master by Research
Language: English
Published: Nanyang Technological University 2023
Subjects: Engineering::Computer science and engineering
Online Access: https://hdl.handle.net/10356/171953
Institution: Nanyang Technological University
Record ID: sg-ntu-dr.10356-171953 (record last updated 2023-12-01)
School: School of Computer Science and Engineering
Other Author: Weichen Liu (liu@ntu.edu.sg)
Degree: Master of Engineering
Date Deposited: 2023-11-17
Citation: Wang, J. (2023). Unsupervised learning with diffusion models. Master's thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/171953
DOI: 10.32657/10356/171953
License: This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0)
File Format: application/pdf