INDONESIAN TEXT-TO-IMAGE SYNTHESIS USING GENERATIVE ADVERSARIAL NETWORKS
Main Author: | Raharja Surya Mahadi, Made |
---|---|
Format: | Theses |
Language: | Indonesia |
Online Access: | https://digilib.itb.ac.id/gdl/view/63340 |
Institution: | Institut Teknologi Bandung |
id |
id-itb.:63340 |
---|---|
spelling |
id-itb.:63340 2022-01-31T09:13:26Z INDONESIAN TEXT-TO-IMAGE SYNTHESIS USING GENERATIVE ADVERSARIAL NETWORKS Raharja Surya Mahadi, Made Indonesia Theses Generative Adversarial Networks, Text-to-Image Synthesis INSTITUT TEKNOLOGI BANDUNG https://digilib.itb.ac.id/gdl/view/63340 text |
institution |
Institut Teknologi Bandung |
building |
Institut Teknologi Bandung Library |
continent |
Asia |
country |
Indonesia |
content_provider |
Institut Teknologi Bandung |
collection |
Digital ITB |
language |
Indonesia |
description |
ABSTRACT
INDONESIAN TEXT-TO-IMAGE SYNTHESIS USING GENERATIVE
ADVERSARIAL NETWORKS
By
Made Raharja Surya Mahadi
NIM 23520022
(Master’s Program in Informatics)
Text-to-image generation is the task of synthesizing a realistic image from a text
description. Currently, many deep learning approaches use only two components: a text
encoder and an image generator. The text encoder extracts features from a given
sentence, and the image generator produces a realistic image from those features.
Research on this topic is difficult because of the domain gap between natural
language and vision. Most current research focuses only on producing photo-realistic
images, while the other domain, language, receives less attention. Much of the
current research uses English as the input text, although there are many other
languages around the world. Bahasa Indonesia, the official language of Indonesia,
is widely used and is taught in the Philippines, Australia, and Japan. Translating
an existing dataset into another language, or creating a new one, with good quality
is costly. Research in this area is necessary because we need to examine how an
image generator performs in languages other than English while still generating
photo-realistic images.
To achieve this, we translate the CUB dataset into Bahasa Indonesia using Google
Translate followed by manual translation by humans. We use Sentence BERT as the text
encoder and FastGAN as the image generator. FastGAN uses several skip-layer
excitation modules and an autoencoder to generate images at a resolution of
512×512×3, twice as large as the output of the current state-of-the-art model
(Zhang et al., 2019). We obtain an Inception Score of 4.76±0.43 and a Fréchet
Inception Distance of 46.401. These results indicate that the generated images have
high object quality and are not far behind those of text-to-image architectures
that use English. In addition, the survey results show that images produced in
conditional mode are generally rated better than those produced in unconditional
mode.
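
For readers unfamiliar with the Inception Score reported above, the following is a minimal sketch of how such a score is commonly computed from classifier outputs. It is not taken from the thesis: the function names are hypothetical, the class probabilities are assumed to come from a pretrained image classifier that is not shown here, and reading the "±" as a standard deviation over splits is an assumption based on common practice.

```python
import math


def inception_score(probs):
    """Inception Score for a list of per-image class distributions p(y|x).

    Each element of `probs` is a list of class probabilities summing to 1.
    The score is exp(mean over images of KL(p(y|x) || p(y))), where p(y)
    is the average of p(y|x) over all images in `probs`.
    """
    n = len(probs)
    k = len(probs[0])
    # Marginal class distribution p(y) over the whole set.
    marginal = [sum(p[j] for p in probs) / n for j in range(k)]
    kl_sum = 0.0
    for p in probs:
        # Terms with p(y|x) = 0 contribute nothing to the KL divergence.
        kl_sum += sum(
            pj * math.log(pj / mj) for pj, mj in zip(p, marginal) if pj > 0
        )
    return math.exp(kl_sum / n)


def inception_score_over_splits(probs, n_splits):
    """Mean and (population) standard deviation of the score over equal splits.

    Assumption: a reported value such as "mean ± std" is produced this way.
    Images left over after dividing into equal splits are ignored.
    """
    size = len(probs) // n_splits
    scores = [
        inception_score(probs[i * size:(i + 1) * size]) for i in range(n_splits)
    ]
    mean = sum(scores) / n_splits
    std = math.sqrt(sum((s - mean) ** 2 for s in scores) / n_splits)
    return mean, std


if __name__ == "__main__":
    # Toy input: confident predictions spread evenly over two classes.
    toy = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
    print(inception_score(toy))
    print(inception_score_over_splits(toy, n_splits=2))
```

A higher score requires both confident per-image predictions and a diverse set of predicted classes. In the toy input the score is close to 2, the number of classes, which is the maximum for a two-class classifier.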
Keywords: Generative Adversarial Networks, Text-to-Image Synthesis |
format |
Theses |
author |
Raharja Surya Mahadi, Made |
title |
INDONESIAN TEXT-TO-IMAGE SYNTHESIS USING GENERATIVE ADVERSARIAL NETWORKS |
url |
https://digilib.itb.ac.id/gdl/view/63340 |