Cycle-consistent inverse GAN for text-to-image synthesis

This paper investigates an open research task of text-to-image synthesis for automatically generating or manipulating images from text descriptions. Prevailing methods mainly take the textual descriptions as the conditional input for the GAN generation, and need to train different models for the tex...

وصف كامل

محفوظ في:

التفاصيل البيبلوغرافية
المؤلفون الرئيسيون:	Wang, Hao, Lin, Guosheng, Hoi, Steven C. H., Miao, Chunyan
مؤلفون آخرون:	School of Computer Science and Engineering
التنسيق:	Conference or Workshop Item
اللغة:	English
منشور في:	2022
الموضوعات:	Engineering::Computer science and engineering Text-to-Image Synthesis Cycle consistency
الوصول للمادة أونلاين:	https://hdl.handle.net/10356/156034
الوسوم:	إضافة وسم لا توجد وسوم, كن أول من يضع وسما على هذه التسجيلة!

الوصف
الملخص:	This paper investigates an open research task of text-to-image synthesis for automatically generating or manipulating images from text descriptions. Prevailing methods mainly take the textual descriptions as the conditional input for the GAN generation, and need to train different models for the text-guided image generation and manipulation tasks. In this paper, we propose a novel unified framework of Cycle-consistent Inverse GAN (CI-GAN) for both text-to-image generation and text-guided image manipulation tasks. Specifically, we first train a GAN model without text input, aiming to generate images with high diversity and quality. Then we learn a GAN inversion model to convert the images back to the GAN latent space and obtain the inverted latent codes for each image, where we introduce the cycle-consistency training to learn more robust and consistent inverted latent codes. We further uncover the semantics of the latent space of the trained GAN model, by learning a similarity model between text representations and the latent codes. In the text-guided optimization module, we can generate images with the desired semantic attributes through optimization on the inverted latent codes. Extensive experiments on the Recipe1M and CUB datasets validate the efficacy of our proposed framework.

Cycle-consistent inverse GAN for text-to-image synthesis

مواد مشابهة