Exploring diffusion time-steps for unsupervised representation learning

Representation learning is all about discovering the hidden modular attributes that generate the data faithfully. We explore the potential of Denoising Diffusion Probabilistic Model (DM) in unsupervised learning of the modular attributes. We build a theoretical framework that connects the diffusion time-steps and the hidden attributes, which serves as an effective inductive bias for unsupervised learning. Specifically, the forward diffusion process incrementally adds Gaussian noise to samples at each time-step, which essentially collapses different samples into similar ones by losing attributes, e.g., fine-grained attributes such as texture are lost with less noise added (i.e., early time-steps), while coarse-grained ones such as shape are lost by adding more noise (i.e., late time-steps). To disentangle the modular attributes, at each time-step t, we learn a t-specific feature to compensate for the newly lost attribute, and the set of all {1,...,t}-specific features, corresponding to the cumulative set of lost attributes, is trained to make up for the reconstruction error of a pre-trained DM at time-step t. On CelebA, FFHQ, and Bedroom datasets, the learned feature significantly improves attribute classification and enables faithful counterfactual generation, e.g., interpolating only one specified attribute between two images, validating the disentanglement quality. Code is available at https://github.com/yue-zhongqi/diti.
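
The following is a minimal, hypothetical PyTorch sketch of the mechanism described in the abstract (not the authors' implementation; the official code is at the GitHub link above). It shows the standard DDPM forward process x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps, a toy encoder whose output is split into per-time-step feature chunks so that the chunks for steps {1,...,t} stand in for the attributes lost up to step t, and a loss that trains those cumulative features to reduce the reconstruction error of a frozen, pre-trained conditional denoiser at step t. All names (TimeStepEncoder, forward_diffuse, compensation_loss), the chunk-masking rule, and the denoiser signature dm(x_t, t, z) are illustrative assumptions.

import torch
import torch.nn as nn

num_steps = 1000
betas = torch.linspace(1e-4, 2e-2, num_steps)        # standard DDPM noise schedule
alpha_bar = torch.cumprod(1.0 - betas, dim=0)         # cumulative signal retention per step

def forward_diffuse(x0: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
    """Noise clean images x0 to time-step t; larger t destroys coarser attributes."""
    eps = torch.randn_like(x0)
    a = alpha_bar[t].view(-1, 1, 1, 1)
    return a.sqrt() * x0 + (1.0 - a).sqrt() * eps

class TimeStepEncoder(nn.Module):
    """Toy encoder whose feature is split into chunks; the chunks assigned to
    steps {1,...,t} jointly compensate for the attributes lost up to step t."""
    def __init__(self, feature_dim: int = 512, num_chunks: int = 64):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feature_dim),
        )
        self.num_chunks = num_chunks
        self.chunk_dim = feature_dim // num_chunks

    def forward(self, x0: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        z = self.backbone(x0)                         # full feature of the clean image
        # Keep only the chunks for steps up to t and zero out the rest
        # (one possible masking rule; the paper's exact partition may differ).
        k = torch.ceil((t.float() + 1) / num_steps * self.num_chunks).long()
        mask = torch.arange(self.num_chunks, device=z.device)[None, :] < k[:, None]
        mask = mask.repeat_interleave(self.chunk_dim, dim=1).float()
        return z * mask

def compensation_loss(dm, encoder, x0, t):
    """Train the cumulative {1,...,t}-specific features to make up for the
    reconstruction error of a frozen, pre-trained conditional denoiser dm."""
    x_t = forward_diffuse(x0, t)
    z_t = encoder(x0, t)
    x0_hat = dm(x_t, t, z_t)          # hypothetical denoiser predicting x0 from x_t and z_t
    return ((x0_hat - x0) ** 2).mean()

In practice, t would be sampled uniformly per batch and gradients back-propagated only through the encoder, keeping the pre-trained DM fixed; the sketch omits the training loop.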

Bibliographic Details
Main Authors: YUE, Zhongqi, WANG, Jiankun, SUN, Qianru, JI, Lei, CHANG, Eric I-Chao, ZHANG, Hanwang
Format: text
Language:English
Published: Institutional Knowledge at Singapore Management University 2024
Subjects: Graphics and Human Computer Interfaces
Online Access:https://ink.library.smu.edu.sg/sis_research/9215
https://ink.library.smu.edu.sg/context/sis_research/article/10221/viewcontent/3386_exploring_diffusion_time_steps__1_.pdf
Institution: Singapore Management University
Record ID: sg-smu-ink.sis_research-10221 (DSpace record, last updated 2024-08-15)
Publish Date: 2024-05-01
Collection: Research Collection School Of Computing and Information Systems, InK@SMU (SMU Libraries)
License: http://creativecommons.org/licenses/by-nc-nd/4.0/