INDONESIAN TEXT-TO-IMAGE SYNTHESIS USING GENERATIVE ADVERSARIAL NETWORKS

Bibliographic Details
Main Author: Raharja Surya Mahadi, Made
Format: Theses
Language: Indonesia
Online Access: https://digilib.itb.ac.id/gdl/view/63340
Institution: Institut Teknologi Bandung
id id-itb.:63340
spelling id-itb.:63340 2022-01-31T09:13:26Z
building Institut Teknologi Bandung Library
continent Asia
country Indonesia
content_provider Institut Teknologi Bandung
collection Digital ITB
description ABSTRACT

INDONESIAN TEXT-TO-IMAGE SYNTHESIS USING GENERATIVE ADVERSARIAL NETWORKS

By Made Raharja Surya Mahadi, NIM 23520022 (Master's Program in Informatics)

Text-to-image generation is the task of synthesizing a realistic image from a text description. Most current deep learning approaches combine two models: a text encoder, which extracts features from a given sentence, and an image generator, which produces a realistic image from those features. The task is difficult because of the domain gap between natural language and vision. Most research on this topic focuses on producing photo-realistic images, while the language side receives less attention. Much of the current research uses English as the input text, although there are many other languages in the world. Bahasa Indonesia, the official language of Indonesia, is widely used and is taught in the Philippines, Australia, and Japan. Translating an existing dataset into another language, or creating a new one, with good quality is costly. Research in this area is necessary to examine how an image generator performs with input in languages other than English while still generating photo-realistic images. To that end, we translate the CUB dataset into Bahasa Indonesia using Google Translate and manual human translation. We use Sentence BERT as the text encoder and FastGAN as the image generator. FastGAN uses many skip excitation modules and an autoencoder to generate images at a resolution of 512×512×3, twice as large as that of the current state-of-the-art model (Zhang et al., 2019). We obtain an Inception Score of 4.76 ± 0.43 and a Fréchet Inception Distance of 46.401. These results indicate that the generated images have high object quality and are not far behind those of text-to-image architectures that use English. In addition, survey results show that images produced in conditional mode are somewhat better than those produced in unconditional mode.

Keywords: Generative Adversarial Networks, Text-to-Image Synthesis
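The Inception Score quoted in the abstract is conventionally defined as exp(E_x[KL(p(y|x) || p(y))]), where p(y|x) is a classifier's class distribution for one generated image and p(y) is the average of those distributions. The sketch below illustrates that standard definition only; it is not the thesis's evaluation code. The function name `inception_score`, the `n_splits` parameter, and the plain-list input are assumptions made for illustration, and reading the "±" as a standard deviation over splits follows common practice rather than anything stated in the record.

```python
import math
from statistics import mean, pstdev


def inception_score(probs, n_splits=1):
    """Return (mean, std) of the Inception Score over n_splits chunks.

    probs: list of per-image class-probability lists, each summing to 1.
    In a real evaluation these would be softmax outputs of an Inception
    classifier on generated images; here they are hypothetical inputs.
    """
    if n_splits < 1 or n_splits > len(probs):
        raise ValueError("n_splits must be between 1 and len(probs)")
    split_size = len(probs) // n_splits  # leftover images are dropped
    scores = []
    for s in range(n_splits):
        chunk = probs[s * split_size:(s + 1) * split_size]
        n_classes = len(chunk[0])
        # Marginal class distribution p(y) over this chunk.
        marginal = [sum(p[k] for p in chunk) / len(chunk)
                    for k in range(n_classes)]
        # Average KL(p(y|x) || p(y)); zero-probability terms contribute 0.
        kl_sum = 0.0
        for p in chunk:
            kl_sum += sum(pk * math.log(pk / marginal[k])
                          for k, pk in enumerate(p) if pk > 0)
        scores.append(math.exp(kl_sum / len(chunk)))
    return mean(scores), pstdev(scores)
```

Under this definition the score is 1 when every image receives the same class distribution and approaches the number of classes when predictions are confident and evenly spread across classes, so higher values suggest sharper and more varied images.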
_version_ 1822932152903270400