Unsupervised learning with diffusion models
In computer vision, a key goal is to obtain visual representations that faithfully capture the underlying structure and semantics of the data, encompassing object identities, positions, textures, and lighting conditions. However, existing methods for un-/self-supervised learning (SSL) are restricted...
Main Author: | Wang, Jiankun |
---|---|
Other Authors: | Weichen Liu |
Format: | Thesis-Master by Research |
Language: | English |
Published: | Nanyang Technological University, 2023 |
Subjects: | Engineering::Computer science and engineering |
Online Access: | https://hdl.handle.net/10356/171953 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-171953 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-171953; 2023-12-01T01:52:37Z; Unsupervised learning with diffusion models; Wang, Jiankun; Weichen Liu; School of Computer Science and Engineering; liu@ntu.edu.sg; Engineering::Computer science and engineering; In computer vision, a key goal is to obtain visual representations that faithfully capture the underlying structure and semantics of the data, including object identities, positions, textures, and lighting conditions. However, existing methods for un-/self-supervised learning (SSL) are restricted to untangling basic augmentation attributes such as rotation and color modification, which limits their capacity to modularize the underlying semantics efficiently. In this thesis, we propose DiffSiam, a novel SSL framework that incorporates a disentangled representation learning algorithm based on diffusion models. By introducing additional Gaussian noise during the diffusion forward process, DiffSiam collapses samples with similar attributes, intensifying the attribute loss. To compensate, we learn a set of modular features that expands over time, adhering to the reconstruction of the diffusion model. These training dynamics bias the learned features towards disentangling diverse semantics, from fine-grained to coarse-grained attributes. Experimental results demonstrate the superior performance of DiffSiam on various classification benchmarks and generative tasks, validating its effectiveness in generating disentangled representations.; Master of Engineering; 2023-11-17T04:16:22Z; 2023; Thesis-Master by Research; Wang, J. (2023). Unsupervised learning with diffusion models. Master's thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/171953; 10.32657/10356/171953; en; This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).; application/pdf; Nanyang Technological University |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Engineering::Computer science and engineering |
description |
In computer vision, a key goal is to obtain visual representations that faithfully capture the underlying structure and semantics of the data, including object identities, positions, textures, and lighting conditions. However, existing methods for un-/self-supervised learning (SSL) are restricted to untangling basic augmentation attributes such as rotation and color modification, which limits their capacity to modularize the underlying semantics efficiently. In this thesis, we propose DiffSiam, a novel SSL framework that incorporates a disentangled representation learning algorithm based on diffusion models. By introducing additional Gaussian noise during the diffusion forward process, DiffSiam collapses samples with similar attributes, intensifying the attribute loss. To compensate, we learn a set of modular features that expands over time, adhering to the reconstruction of the diffusion model. These training dynamics bias the learned features towards disentangling diverse semantics, from fine-grained to coarse-grained attributes. Experimental results demonstrate the superior performance of DiffSiam on various classification benchmarks and generative tasks, validating its effectiveness in generating disentangled representations. |
author2 |
Weichen Liu |
format |
Thesis-Master by Research |
author |
Wang, Jiankun |
title |
Unsupervised learning with diffusion models |
publisher |
Nanyang Technological University |
publishDate |
2023 |
url |
https://hdl.handle.net/10356/171953 |
_version_ |
1784855535421489152 |
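
The description above says that adding Gaussian noise in the diffusion forward process collapses samples with similar attributes. The sketch below illustrates that effect using the standard DDPM closed-form forward process, x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps. It is not the thesis's DiffSiam code: the linear noise schedule, its parameters, and the function names `alpha_bar`, `forward_noise`, and `mean_gap` are assumptions chosen for illustration.

```python
import math
import random


def alpha_bar(t, num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_s) for s = 1..t.

    Assumes a linear beta schedule (an illustrative choice, not taken
    from the record above).
    """
    prod = 1.0
    for s in range(1, t + 1):
        beta = beta_start + (beta_end - beta_start) * (s - 1) / (num_steps - 1)
        prod *= 1.0 - beta
    return prod


def forward_noise(x0, t, rng):
    """Draw x_t from q(x_t | x_0) = N(sqrt(abar_t) * x_0, (1 - abar_t) * I)."""
    abar = alpha_bar(t)
    signal, sigma = math.sqrt(abar), math.sqrt(1.0 - abar)
    return [signal * v + sigma * rng.gauss(0.0, 1.0) for v in x0]


def mean_gap(xa, xb, t):
    """Distance between the means of q(x_t | xa) and q(x_t | xb)."""
    return math.sqrt(alpha_bar(t)) * math.dist(xa, xb)


if __name__ == "__main__":
    rng = random.Random(0)
    # Two toy samples that differ only in one small attribute.
    xa, xb = [1.0, 0.0], [1.0, 0.2]
    for t in (0, 250, 500, 1000):
        print(t, mean_gap(xa, xb, t), forward_noise(xa, t, rng))
```

As t grows, the gap between the two noised distributions' means shrinks by the factor sqrt(abar_t) while the noise variance 1 - abar_t grows, so samples that differ only in a small attribute become hard to tell apart. Under these assumptions, that is the sense in which added noise collapses similar samples.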