Beyond textual constraints: Learning novel diffusion conditions with fewer examples

In this paper, we delve into a novel aspect of learning novel diffusion conditions with datasets an order of magnitude smaller. The rationale behind our approach is the elimination of textual constraints during the few-shot learning process. To that end, we implement two optimization strategies. The first, prompt-free conditional learning, utilizes a prompt-free encoder derived from a pre-trained Stable Diffusion model. This strategy is designed to adapt new conditions to the diffusion process by minimizing the textual-visual correlation, thereby ensuring a more precise alignment between the generated content and the specified conditions. The second strategy entails condition-specific negative rectification, which addresses the inconsistencies typically brought about by classifier-free guidance in few-shot training contexts. Our extensive experiments across a variety of condition modalities demonstrate the effectiveness and efficiency of our framework, yielding results comparable to those obtained with datasets a thousand times larger.
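
For readers unfamiliar with the mechanism the second strategy targets, the following is a minimal sketch of standard classifier-free guidance as commonly applied in Stable Diffusion sampling. It is background illustration only, not the paper's condition-specific negative rectification; the tensor names (eps_cond, eps_uncond) and the default guidance scale are hypothetical choices for the example.

import torch

def classifier_free_guidance(eps_cond: torch.Tensor,
                             eps_uncond: torch.Tensor,
                             guidance_scale: float = 7.5) -> torch.Tensor:
    # Standard CFG combination of the conditional and unconditional noise
    # predictions: eps = eps_uncond + w * (eps_cond - eps_uncond).
    # The abstract notes that this guidance step can introduce inconsistencies
    # when the condition is learned from only a few examples.
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)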


Bibliographic Details
Main Authors: YU, Yuyang; LIU, Bangzhen; ZHENG, Chenxi; XU, Xuemiao; ZHANG, Huaidong; HE, Shengfeng
Format: text (application/pdf)
Language: English
Published: Institutional Knowledge at Singapore Management University, 2024-06-01
Collection: Research Collection School Of Computing and Information Systems (InK@SMU, SMU Libraries)
DOI: 10.1109/CVPR52733.2024.00679
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Subjects: Prompt-free conditional learning; Conditional negative rectification; Training; Computer vision; Adaptation models; Codes; Text to image; Diffusion processes; Diffusion model; Image synthesis; Controllable image generation; Artificial Intelligence and Robotics; Graphics and Human Computer Interfaces
Online Access: https://ink.library.smu.edu.sg/sis_research/9774
https://ink.library.smu.edu.sg/context/sis_research/article/10774/viewcontent/Yu_Beyond_CVPR_2024_paper.pdf
Institution: Singapore Management University