
UniD3: unified discrete diffusion for simultaneous vision-language generation

The recently developed discrete diffusion model performs extraordinarily well in generation tasks, especially in the text-to-image task, showing great potential for modeling multimodal signals. In this paper, we leverage these properties and present a unified multimodal generation model, which can p...

Bibliographic Details
Main Authors: Hu, Minghui, Zheng, Chuanxia, Cham, Tat-Jen, Suganthan, Ponnuthurai Nagaratnam, Yang, Zuopeng, Zheng, Heliang, Wang, Chaoyue, Tao, Dacheng
Other Authors: School of Computer Science and Engineering
Format: Conference or Workshop Item
Language: English
Published: 2023
Online Access: https://hdl.handle.net/10356/172665
https://openreview.net/forum?id=8JqINxA-2a
Institution: Nanyang Technological University