Cycle-consistent inverse GAN for text-to-image synthesis

This paper investigates an open research task of text-to-image synthesis for automatically generating or manipulating images from text descriptions. Prevailing methods mainly take the textual descriptions as the conditional input for the GAN generation, and need to train different models for the tex...

Bibliographic Details
Main Authors:	Wang, Hao; Lin, Guosheng; Hoi, Steven C. H.; Miao, Chunyan
Other Authors:	School of Computer Science and Engineering
Format:	Conference or Workshop Item
Language:	English
Published:	2022
Subjects:	Engineering::Computer science and engineering; Text-to-Image Synthesis; Cycle consistency
Online Access:	https://hdl.handle.net/10356/156034
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-156034
record_format	dspace
spelling	sg-ntu-dr.10356-1560342022-04-01T06:07:10Z Cycle-consistent inverse GAN for text-to-image synthesis Wang, Hao Lin, Guosheng Hoi, Steven C. H. Miao, Chunyan School of Computer Science and Engineering 29th ACM International Conference on Multimedia (MM '21) Joint NTU-UBC Research Centre of Excellence in Active Living for the Elderly (LILY) Engineering::Computer science and engineering Text-to-Image Synthesis Cycle consistency This paper investigates an open research task of text-to-image synthesis for automatically generating or manipulating images from text descriptions. Prevailing methods mainly take the textual descriptions as the conditional input for the GAN generation, and need to train different models for the text-guided image generation and manipulation tasks. In this paper, we propose a novel unified framework of Cycle-consistent Inverse GAN (CI-GAN) for both text-to-image generation and text-guided image manipulation tasks. Specifically, we first train a GAN model without text input, aiming to generate images with high diversity and quality. Then we learn a GAN inversion model to convert the images back to the GAN latent space and obtain the inverted latent codes for each image, where we introduce the cycle-consistency training to learn more robust and consistent inverted latent codes. We further uncover the semantics of the latent space of the trained GAN model, by learning a similarity model between text representations and the latent codes. In the text-guided optimization module, we can generate images with the desired semantic attributes through optimization on the inverted latent codes. Extensive experiments on the Recipe1M and CUB datasets validate the efficacy of our proposed framework. 
AI Singapore Ministry of Education (MOE) Ministry of Health (MOH) National Research Foundation (NRF) Submitted/Accepted version This research is supported, in part, by the National Research Foundation (NRF), Singapore under its AI Singapore Programme (AISG Award No: AISG-GC-2019-003) and under its NRF Investigatorship Programme (NRFI Award No. NRF-NRFI05-2019-0002). This research is also supported, in part, by the Singapore Ministry of Health under its National Innovation Challenge on Active and Confident Ageing (NIC Project No. MOH/NIC/COG04/2017 and MOH/NIC/HAIG03/2017), and the MOE Tier-1 research grants: RG28/18 (S) and RG22/19 (S). 2022-04-01T06:07:10Z 2022-04-01T06:07:10Z 2021 Conference Paper Wang, H., Lin, G., Hoi, S. C. H. & Miao, C. (2021). Cycle-consistent inverse GAN for text-to-image synthesis. 29th ACM International Conference on Multimedia (MM '21), 630-638. https://dx.doi.org/10.1145/3474085.3475226 9781450386517 https://hdl.handle.net/10356/156034 10.1145/3474085.3475226 630 638 en AISG-GC-2019-003 NRF-NRFI05-2019-0002 MOH/NIC/COG04/2017 MOH/NIC/HAIG03/2017 RG28/18 (S) RG22/19 (S) © 2021 Association for Computing Machinery. All rights reserved. This paper was published in Proceedings of the 29th ACM International Conference on Multimedia (MM' 21) and is made available with permission of Association for Computing Machinery. application/pdf
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	Engineering::Computer science and engineering Text-to-Image Synthesis Cycle consistency
spellingShingle	Engineering::Computer science and engineering Text-to-Image Synthesis Cycle consistency Wang, Hao Lin, Guosheng Hoi, Steven C. H. Miao, Chunyan Cycle-consistent inverse GAN for text-to-image synthesis
description	This paper investigates an open research task of text-to-image synthesis for automatically generating or manipulating images from text descriptions. Prevailing methods mainly take the textual descriptions as the conditional input for the GAN generation, and need to train different models for the text-guided image generation and manipulation tasks. In this paper, we propose a novel unified framework of Cycle-consistent Inverse GAN (CI-GAN) for both text-to-image generation and text-guided image manipulation tasks. Specifically, we first train a GAN model without text input, aiming to generate images with high diversity and quality. Then we learn a GAN inversion model to convert the images back to the GAN latent space and obtain the inverted latent codes for each image, where we introduce the cycle-consistency training to learn more robust and consistent inverted latent codes. We further uncover the semantics of the latent space of the trained GAN model, by learning a similarity model between text representations and the latent codes. In the text-guided optimization module, we can generate images with the desired semantic attributes through optimization on the inverted latent codes. Extensive experiments on the Recipe1M and CUB datasets validate the efficacy of our proposed framework.
author2	School of Computer Science and Engineering
author_facet	School of Computer Science and Engineering Wang, Hao Lin, Guosheng Hoi, Steven C. H. Miao, Chunyan
format	Conference or Workshop Item
author	Wang, Hao Lin, Guosheng Hoi, Steven C. H. Miao, Chunyan
author_sort	Wang, Hao
title	Cycle-consistent inverse GAN for text-to-image synthesis
title_short	Cycle-consistent inverse GAN for text-to-image synthesis
title_full	Cycle-consistent inverse GAN for text-to-image synthesis
title_fullStr	Cycle-consistent inverse GAN for text-to-image synthesis
title_full_unstemmed	Cycle-consistent inverse GAN for text-to-image synthesis
title_sort	cycle-consistent inverse gan for text-to-image synthesis
publishDate	2022
url	https://hdl.handle.net/10356/156034
_version_	1729789496009949184
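
The pipeline summarized in the description field (invert an image to a latent code, check cycle consistency, then optimize the code toward a text representation) can be illustrated with a toy numerical sketch. Everything below is an assumption made for illustration and is not taken from the paper: the generator is a fixed random linear map, the inversion model is its pseudo-inverse, the similarity model is plain cosine similarity, and all function names and hyperparameters are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
LATENT_DIM, IMAGE_DIM = 8, 32

# Toy stand-in for a pretrained, text-free generator: a fixed linear map.
W = rng.normal(size=(IMAGE_DIM, LATENT_DIM))
# Toy stand-in for the learned inversion model: the pseudo-inverse of W.
W_PINV = np.linalg.pinv(W)


def generate(z):
    """Hypothetical generator G: latent code -> flattened 'image'."""
    return W @ z


def invert(x):
    """Hypothetical inversion model E: 'image' -> latent code."""
    return W_PINV @ x


def cycle_consistency_loss(z):
    """Squared error between z and its reconstruction E(G(z))."""
    return float(np.sum((invert(generate(z)) - z) ** 2))


def similarity(z, t):
    """Cosine similarity, standing in for a learned text-latent similarity model."""
    return float(z @ t / (np.linalg.norm(z) * np.linalg.norm(t)))


def similarity_grad(z, t):
    """Analytic gradient of the cosine similarity with respect to z."""
    nz, nt = np.linalg.norm(z), np.linalg.norm(t)
    return t / (nz * nt) - (z @ t) * z / (nz ** 3 * nt)


def text_guided_optimize(z_init, t, steps=200, lr=0.1, reg=0.01):
    """Gradient ascent on similarity(z, t) with a pull back toward z_init."""
    z = z_init.copy()
    for _ in range(steps):
        grad = similarity_grad(z, t) - reg * (z - z_init)
        z = z + lr * grad
    return z


# Invert a synthetic "image", then edit its latent code toward a text embedding.
z_true = rng.normal(size=LATENT_DIM)
image = generate(z_true)
z_inverted = invert(image)
text_embedding = rng.normal(size=LATENT_DIM)  # hypothetical text representation
z_edited = text_guided_optimize(z_inverted, text_embedding)
edited_image = generate(z_edited)
```

In this sketch the regularization weight reg keeps the edited code near the inverted one, so the result stays related to the input image while moving toward the text; how the published method balances these terms is not stated in this record.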
