Be a cartoonist : editing anime images using generative adversarial network
Main Author:
Other Authors:
Format: Final Year Project
Language: English
Published: Nanyang Technological University, 2022
Subjects:
Online Access: https://hdl.handle.net/10356/156440
Institution: Nanyang Technological University
Summary: With the rise in popularity of generative models, many studies have started to look at furthering their applicability as well as their performance. One such application is image-to-image translation, which can be used to transform an image from domain A to domain B. However, when the domains differ greatly in structure, such as between real faces and cartoon faces, it is difficult to perform high-quality translation while retaining the original identity. Some existing works suggest the use of cycle consistency and few-shot training in image-to-image translation pipelines, while others recommend layer swapping and freezing the lower-resolution generator layers on top of a well-pretrained StyleGAN. However, these solutions are ineffective at translating real faces to anime images due to the difference in face structure. To address this problem, we introduce a perceptual loss and feature-based multi-discriminators to supervise the training process with the help of an off-the-shelf StyleGAN trained on the real-image domain, so that the face retains its original identity after the image is translated into the anime domain.
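The abstract names the perceptual loss without further detail. As a purely illustrative sketch (not the project's code), a perceptual loss is usually computed by comparing images in the feature space of a frozen pretrained network; torchvision's VGG16 stands in here for the feature extractor, whereas the project supervises with the off-the-shelf StyleGAN instead:

```python
import torch.nn as nn
import torchvision.models as models


class PerceptualLoss(nn.Module):
    """L1 distance between deep features of a frozen VGG16.

    Matching deep features rather than raw pixels lets a translated
    anime face keep the source identity even though low-level texture
    and colour change drastically.
    """

    def __init__(self, layer_ids=(3, 8, 15, 22)):  # relu1_2, relu2_2, relu3_3, relu4_3
        super().__init__()
        vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).features
        self.slices = nn.ModuleList()
        prev = 0
        for lid in layer_ids:
            self.slices.append(vgg[prev:lid + 1])  # consecutive VGG stages
            prev = lid + 1
        self.eval()                                # the loss network is never trained
        for p in self.parameters():
            p.requires_grad_(False)

    def forward(self, fake, real):
        loss, x, y = 0.0, fake, real
        for s in self.slices:
            x, y = s(x), s(y)                      # features at increasing depth
            loss = loss + nn.functional.l1_loss(x, y)
        return loss


# Hypothetical usage inside a training step:
#   ploss = PerceptualLoss()
#   loss_id = ploss(translated_anime, source_face)  # identity-preservation term
```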
We then explore anime image editing using closed-form factorisation to edit semantic details such as expression, pose, and hair style.
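Closed-form factorisation admits a genuinely short sketch: following SeFa (Shen and Zhou, 2021), semantic edit directions are the top eigenvectors of WᵀW, where W is the weight of the generator layer that projects latent codes. The snippet below is illustrative, with a random matrix standing in for the real StyleGAN weight:

```python
import torch


def closed_form_directions(weight: torch.Tensor, k: int = 5) -> torch.Tensor:
    """SeFa-style factorisation: top-k eigenvectors of W^T W.

    No training or labels are needed; the edit directions come purely
    from the generator's own weights.
    """
    w = weight / weight.norm(dim=0, keepdim=True)   # normalise each column
    _, eigvecs = torch.linalg.eigh(w.T @ w)         # eigenvalues in ascending order
    return eigvecs[:, -k:].flip(-1).T               # (k, latent_dim), strongest first


# Hypothetical usage with a stand-in weight matrix (the project would use
# the style-projection layer of its trained anime StyleGAN instead).
latent_dim = 512
W = torch.randn(4 * latent_dim, latent_dim)
directions = closed_form_directions(W, k=5)

z = torch.randn(1, latent_dim)                      # a latent code to edit
alpha = 3.0                                         # edit strength
z_edited = z + alpha * directions[0]                # shifts e.g. pose or expression
```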
In this project, we also explore StyleGAN compression through knowledge distillation, since StyleGAN has millions of parameters and is difficult to run on edge devices with a low computational budget.
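The distillation idea can likewise be sketched in its simplest form: freeze the large teacher generator and train a smaller student to reproduce the teacher's output for the same latent codes. The networks below are toy MLP stand-ins, not the actual StyleGAN architectures:

```python
import torch
import torch.nn as nn

# Toy stand-ins: imagine `teacher` is the full pretrained StyleGAN and
# `student` a slimmer generator meant for edge devices.
teacher = nn.Sequential(nn.Linear(512, 1024), nn.ReLU(), nn.Linear(1024, 3 * 64 * 64))
student = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 3 * 64 * 64))

teacher.eval()
for p in teacher.parameters():
    p.requires_grad_(False)                      # the teacher is frozen

opt = torch.optim.Adam(student.parameters(), lr=2e-4)

for step in range(1_000):                        # distillation loop
    z = torch.randn(16, 512)                     # same latents for both networks
    with torch.no_grad():
        target = teacher(z)                      # teacher output acts as the label
    loss = nn.functional.l1_loss(student(z), target)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

In practice a perceptual term of the kind sketched earlier is often added on top of the pixel loss, so the student matches structure as well as colour.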