Anime characters creation using generative adversarial networks with user inputs

Bibliographic Details
Main Author: Ang, Himari Lixin
Other Authors: Seah Hock Soon
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2024
Subjects:
Online Access: https://hdl.handle.net/10356/175300
Institution: Nanyang Technological University
Description
Summary: Generative Adversarial Network (GAN) is a framework that has been used to generate realistic images of faces, objects, and even landscapes. With its increasing popularity, it can also be used to generate anime facial images. Diffusion models have likewise been on the rise recently, with models like Stable Diffusion taking center stage due to how realistic their generated images are. These models have potential applications in the entertainment industry for creating virtual worlds. This final year project aims to develop a GAN-based system for generating anime characters with user-defined attributes, allowing users to input desired characteristics such as hair color, eye shape, clothing style, and more to create customizable and unique anime character designs. In this project, both GAN and diffusion models are proposed. For the GAN model, we explored an Auxiliary Classifier GAN (ACGAN) architecture to constrain the model so that users can request specific attributes to be present in the generated image. Similarly, for the diffusion model, we adopted the Denoising Diffusion Probabilistic Models (DDPM) framework with a UNet base model. To support attributes, an attribute mapper was designed to learn to map user-input attributes to the random noise that diffusion models use for generation. To tie the models together for ease of use, a backend was created that allows the models to be deployed as a Model-as-a-Service (MaaS), with a NextJS frontend that interacts with it. Instead of having to work with the models directly, users only have to interact with the web application. In the report, we show how we are able to generate anime character faces using both the GAN and the diffusion model. We also present potential future work that could further improve the models, such as improving the dataset for both the images and the attribute tags. Finally, we cover the lifecycle of the project: the architecture of our models, the system architecture, and the implementation and deployment of the project.
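
To illustrate the ACGAN-style conditioning the summary describes, the following is a minimal sketch in PyTorch, not the project's actual code: the generator receives the attribute vector concatenated with the noise, and the discriminator carries an auxiliary head that classifies attributes alongside the real/fake decision. All dimensions (NOISE_DIM, ATTR_DIM, 64x64 output) are hypothetical choices for the example.

    # Minimal ACGAN-style conditioning sketch (PyTorch; dimensions are illustrative).
    import torch
    import torch.nn as nn

    NOISE_DIM, ATTR_DIM = 128, 34  # ATTR_DIM: multi-hot tag vector, e.g. hair color, eye shape

    class Generator(nn.Module):
        def __init__(self):
            super().__init__()
            # Noise and attributes are concatenated, so the generator learns to
            # condition its output on the requested attribute tags.
            self.net = nn.Sequential(
                nn.Linear(NOISE_DIM + ATTR_DIM, 256 * 8 * 8), nn.ReLU(),
                nn.Unflatten(1, (256, 8, 8)),
                nn.ConvTranspose2d(256, 128, 4, stride=2, padding=1), nn.ReLU(),
                nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
                nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1), nn.Tanh(),
            )

        def forward(self, z, attrs):
            return self.net(torch.cat([z, attrs], dim=1))

    class Discriminator(nn.Module):
        def __init__(self):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
                nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
                nn.Flatten(),
            )
            feat = 128 * 16 * 16
            self.adv = nn.Linear(feat, 1)         # real/fake head
            self.aux = nn.Linear(feat, ATTR_DIM)  # auxiliary attribute classifier

        def forward(self, img):
            h = self.features(img)
            return self.adv(h), self.aux(h)

    # Usage: sample a batch of 64x64 images conditioned on requested attributes.
    G = Generator()
    z = torch.randn(4, NOISE_DIM)
    attrs = torch.randint(0, 2, (4, ATTR_DIM)).float()
    imgs = G(z, attrs)  # -> (4, 3, 64, 64)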
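
The attribute mapper for the diffusion model can be read as a small network that maps user-input attributes into the noise space the DDPM denoises. The sketch below is one plausible interpretation of that idea, with a hypothetical AttributeMapper name and shapes, and should not be taken as the project's implementation.

    # Hypothetical attribute-mapper sketch: maps attribute tags into the latent
    # noise that the DDPM starts reverse diffusion from, so sampling can be
    # biased toward the requested attributes.
    import torch
    import torch.nn as nn

    class AttributeMapper(nn.Module):
        def __init__(self, attr_dim=34, noise_shape=(3, 64, 64)):
            super().__init__()
            self.noise_shape = noise_shape
            out = noise_shape[0] * noise_shape[1] * noise_shape[2]
            self.mlp = nn.Sequential(
                nn.Linear(attr_dim, 512), nn.SiLU(),
                nn.Linear(512, out),
            )

        def forward(self, attrs):
            # The predicted offset is added to standard Gaussian noise, shifting
            # the starting point of the reverse process toward the attributes.
            mean = self.mlp(attrs).view(-1, *self.noise_shape)
            return mean + torch.randn_like(mean)

The UNet denoiser then runs the usual DDPM reverse process from this attribute-shaped noise instead of from pure Gaussian noise.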