Cocktail: mixing multi-modality controls for text-conditional image generation

Text-conditional diffusion models are able to generate high-fidelity images with diverse contents. However, linguistic representations frequently exhibit ambiguous descriptions of the envisioned objective imagery, requiring the incorporation of additional control signals to bolster the efficacy of t...

Bibliographic Details
Main Authors: Hu, Minghui, Zheng, Jianbin, Liu, Daqing, Zheng, Chuanxia, Wang, Chaoyue, Tao, Dacheng, Cham, Tat-Jen
Other Authors: School of Computer Science and Engineering
Format: Conference or Workshop Item
Language: English
Published: 2023
Online Access: https://hdl.handle.net/10356/172668
https://nips.cc/virtual/2023/calendar
Institution: Nanyang Technological University