Generative adversarial network (GAN) for image synthesis

Recently, conditional generative adversarial networks (cGANs) have played an important role in image synthesis tasks, and the Vision Transformer (ViT), with its self-attention mechanism, has shown state-of-the-art performance in the computer vision field. In this report, I extend ViT to image synthesis tasks. I propose two ViT-based generator architectures, with upsampling and transposed-convolution encoders respectively, and one ViT-based discriminator. I demonstrate that my models, named cViTGAN, are capable of the image synthesis task. I perform experiments on six different benchmarks, and the models achieve performance comparable to that of the baseline models. My work shows that reasonable results can be achieved with ViT-based models.
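The record gives only this high-level summary of the cViTGAN design, so the following is a minimal, illustrative PyTorch sketch of a class-conditional, ViT-style generator: a noise vector and class label are mapped to a grid of patch tokens, passed through self-attention blocks, and decoded to an image with transposed convolutions. This is an assumption-laden sketch of the general approach, not the report's implementation; the class name, layer sizes, and conditioning scheme are invented for illustration.

```python
# Hypothetical sketch of a ViT-based conditional generator in PyTorch.
# All module names and dimensions below are illustrative assumptions,
# not the cViTGAN architecture described in the report.
import torch
import torch.nn as nn


class ViTConditionalGenerator(nn.Module):
    def __init__(self, num_classes=10, latent_dim=128, embed_dim=256,
                 grid=8, img_channels=3, depth=4, heads=4):
        super().__init__()
        self.grid = grid
        self.embed_dim = embed_dim
        # Class conditioning: embed the label and concatenate it with the noise.
        self.label_embed = nn.Embedding(num_classes, latent_dim)
        # Map (noise ++ label embedding) to a grid x grid set of patch tokens.
        self.to_tokens = nn.Linear(2 * latent_dim, grid * grid * embed_dim)
        # Learnable positional embedding, one vector per token.
        self.pos_embed = nn.Parameter(torch.zeros(1, grid * grid, embed_dim))
        # Standard transformer encoder layers provide the self-attention blocks.
        block = nn.TransformerEncoderLayer(d_model=embed_dim, nhead=heads,
                                           dim_feedforward=4 * embed_dim,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(block, num_layers=depth)
        # Transposed-convolution decoder: 8x8 token grid -> 32x32 RGB image.
        self.decode = nn.Sequential(
            nn.ConvTranspose2d(embed_dim, 128, kernel_size=4, stride=2, padding=1),  # 8 -> 16
            nn.GELU(),
            nn.ConvTranspose2d(128, 64, kernel_size=4, stride=2, padding=1),         # 16 -> 32
            nn.GELU(),
            nn.Conv2d(64, img_channels, kernel_size=3, padding=1),
            nn.Tanh(),
        )

    def forward(self, z, labels):
        cond = torch.cat([z, self.label_embed(labels)], dim=1)
        tokens = self.to_tokens(cond).view(z.size(0), self.grid * self.grid, self.embed_dim)
        tokens = self.encoder(tokens + self.pos_embed)
        # Reshape the token sequence back into a spatial feature map for decoding.
        feat = tokens.transpose(1, 2).reshape(z.size(0), self.embed_dim, self.grid, self.grid)
        return self.decode(feat)


if __name__ == "__main__":
    g = ViTConditionalGenerator()
    z = torch.randn(4, 128)
    y = torch.randint(0, 10, (4,))
    print(g(z, y).shape)  # torch.Size([4, 3, 32, 32])
```

Running the script produces a batch of four 32x32 RGB samples; the resolutions, benchmarks, and conditioning details used in the actual report may differ.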


Bibliographic Details
Main Author: Hou, Boyu
Other Authors: Wen Bihan
Format: Final Year Project
Language: English
Published: Nanyang Technological University, 2022
Subjects: Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence; Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision
Online Access: https://hdl.handle.net/10356/158045
Institution: Nanyang Technological University
id sg-ntu-dr.10356-158045
record_format dspace
spelling sg-ntu-dr.10356-158045 2023-07-07T19:22:27Z
Generative adversarial network (GAN) for image synthesis
Hou, Boyu
Wen Bihan
School of Electrical and Electronic Engineering
bihan.wen@ntu.edu.sg
Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision
Recently, conditional generative adversarial networks (cGANs) have played an important role in image synthesis tasks, and the Vision Transformer (ViT), with its self-attention mechanism, has shown state-of-the-art performance in the computer vision field. In this report, I extend ViT to image synthesis tasks. I propose two ViT-based generator architectures, with upsampling and transposed-convolution encoders respectively, and one ViT-based discriminator. I demonstrate that my models, named cViTGAN, are capable of the image synthesis task. I perform experiments on six different benchmarks, and the models achieve performance comparable to that of the baseline models. My work shows that reasonable results can be achieved with ViT-based models.
Bachelor of Engineering (Electrical and Electronic Engineering)
2022-05-26T06:49:48Z
2022-05-26T06:49:48Z
2022
Final Year Project (FYP)
Hou, B. (2022). Generative adversarial network (GAN) for image synthesis. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/158045
https://hdl.handle.net/10356/158045
en
A3277-211
application/pdf
Nanyang Technological University
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision
description Recently, conditional generative adversarial networks (cGANs) have played an important role in image synthesis tasks, and the Vision Transformer (ViT), with its self-attention mechanism, has shown state-of-the-art performance in the computer vision field. In this report, I extend ViT to image synthesis tasks. I propose two ViT-based generator architectures, with upsampling and transposed-convolution encoders respectively, and one ViT-based discriminator. I demonstrate that my models, named cViTGAN, are capable of the image synthesis task. I perform experiments on six different benchmarks, and the models achieve performance comparable to that of the baseline models. My work shows that reasonable results can be achieved with ViT-based models.
author2 Wen Bihan
author_facet Wen Bihan
Hou, Boyu
format Final Year Project
author Hou, Boyu
author_sort Hou, Boyu
title Generative adversarial network (GAN) for image synthesis
title_sort generative adversarial network (gan) for image synthesis
publisher Nanyang Technological University
publishDate 2022
url https://hdl.handle.net/10356/158045
_version_ 1772825746602983424