Facial image synthesis

Bibliographic Details
Main Author: Shen, Guangxu
Other Authors: Lu Shijian
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2022
Online Access: https://hdl.handle.net/10356/156571
Institution: Nanyang Technological University
Description
Summary: Generative Adversarial Networks (GANs) can carry out challenging tasks such as photo generation, and they have attracted increasing attention and made impressive progress in recent years [6]. Researchers are also exploring their application to more complex tasks such as facial image synthesis. GANs have proven to perform very well in facial expression and attribute editing: a well-trained model can transform a facial image from one attribute or expression to another while preserving identity information [7]. In this project, we first briefly discuss the general problems that researchers face in facial image synthesis. We then evaluate the common practices used to solve those problems and their respective limitations. We analyse two advanced approaches, StarGAN [2] and STGAN [10], and discuss how each carries out facial image generation. We also explore the possibility of combining the best parts of these two models so that our facial expression GAN, CombineGAN, can address both image feature transfer and image quality. One possible way is to use STGAN's generator, which is designed from a selective transfer perspective: Selective Transfer Units (STUs) are built into the encoder-decoder generator architecture so that it can adaptively select and modify encoder features for improved facial image synthesis. We adopt evaluation metrics such as Inception Score (IS) and Fréchet Inception Distance (FID) [18] to evaluate the model's performance quantitatively. We also use a qualitative method, human evaluation via Amazon Mechanical Turk (AMT) [14], to assess the model's performance from a human perspective. Lastly, we apply our model to real-life images to better understand its performance in a different context.
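
For readers unfamiliar with the FID metric named in the summary, the following is a minimal sketch of the Fréchet distance between two sets of feature vectors, not the project's own evaluation code. In practice the features are activations of a pretrained Inception network computed on real and generated images; here random arrays stand in for them, and the function name `frechet_distance` and the variables `real` and `fake` are illustrative assumptions.

```python
import numpy as np


def frechet_distance(feats_real, feats_fake):
    """Frechet distance between Gaussians fitted to two feature sets.

    Each input is a 2-D array with one row per image and one column per
    feature dimension.
    """
    mu_r = feats_real.mean(axis=0)
    mu_f = feats_fake.mean(axis=0)
    sigma_r = np.cov(feats_real, rowvar=False)
    sigma_f = np.cov(feats_fake, rowvar=False)

    # Tr((sigma_r @ sigma_f)^(1/2)) equals the sum of the square roots of
    # the eigenvalues of the product, which are real and non-negative for
    # covariance matrices; clip guards against tiny negative rounding error.
    eigvals = np.linalg.eigvals(sigma_r @ sigma_f)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_r - mu_f
    return float(diff @ diff + np.trace(sigma_r) + np.trace(sigma_f) - 2.0 * tr_sqrt)


# Stand-in features (assumption): 500 samples with 4 dimensions each.
rng = np.random.default_rng(0)
real = rng.normal(size=(500, 4))
fake = real + 3.0  # same covariance, mean shifted by 3 in every dimension

print(frechet_distance(real, real))
print(frechet_distance(real, fake))
```

Identical feature sets give a distance of approximately zero, while the shifted set differs only in its mean, so its distance is approximately the squared length of the shift (4 × 3² = 36). Lower values indicate that generated images are statistically closer to real ones in feature space.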