Stuff segmentation and non-photorealistic rendering with generative adversarial networks

Bibliographic Details
Main Author: Choy, Jin Xiang
Other Authors: Ong Yew Soon
Format: Thesis-Master by Research
Language: English
Published: Nanyang Technological University 2020
Subjects:
Online Access:https://hdl.handle.net/10356/137213
Institution: Nanyang Technological University
Description
Summary: Generative Adversarial Networks (GANs) have shown impressive results in a variety of image generation tasks in recent years, including rendering photorealistic images with artistic styles. However, current work on transforming images has mostly focused either on transforming the whole image or on the thing classes; little attention has been paid to artistically rendering only the stuff classes of images. Existing methods for painting specific image regions also produce unnatural boundaries between painted and non-painted regions. We therefore aim to develop an end-to-end model for the novel task of non-photorealistically rendering the stuff regions of images. Training such a model requires images whose stuff classes are partially painted, and because such images are scarce, we propose a flexible and extensible generation pipeline that uses an image segmentation dataset to produce partially painted image datasets. We train on these datasets with a GAN framework based on the Pix2Pix model, and find that the trained model performs image stuff painting acceptably well and generates results with more natural boundaries between painted and non-painted image regions. We then analyse the effect of the Pix2Pix architecture on the training task, finding that it also yields satisfactory results, and we discuss its limitations in this context and compare its performance with that of image stuff painting. Finally, we introduce the Stuff-Painting GAN (SPGAN), which reduces errors in identifying the image regions to paint by incorporating segmentation masks into the training process: an additional discriminator that takes segmentation masks as input is added to the architecture. We show that SPGAN performs on par with or better than the baseline GAN framework at identifying image regions for painting. To improve segmentation learning, we also introduce a Gaussian error correction at each training iteration.
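
As an illustration only (not code from the thesis), the sketch below shows one way the partially painted image generation pipeline described in the summary could work: stuff regions are selected from a segmentation mask, a stand-in "painting" operation is applied, and the mask edge is softened so painted and non-painted regions blend without hard seams. The function name, the Gaussian-blur stand-in for the artistic renderer, and all parameters are assumptions.

```python
# Hypothetical sketch of a partially-painted-image generator built from a
# segmentation dataset. The real pipeline presumably uses an artistic
# style-transfer model instead of the blur used here as a placeholder.
import numpy as np
from scipy.ndimage import gaussian_filter


def paint_stuff_regions(image, seg_mask, stuff_ids, sigma_style=3.0, sigma_edge=2.0):
    """Composite a non-photorealistic rendering over the stuff regions only.

    image     : float array (H, W, 3) in [0, 1]
    seg_mask  : int array (H, W) of per-pixel class ids
    stuff_ids : set of class ids treated as "stuff" (e.g. sky, grass, water)
    """
    # Stand-in "painting": smooth the image spatially (not across channels).
    painted = gaussian_filter(image, sigma=(sigma_style, sigma_style, 0))

    # Binary mask of stuff pixels, softened at the edges so the painted and
    # unpainted regions blend instead of meeting at a hard seam.
    stuff = np.isin(seg_mask, list(stuff_ids)).astype(np.float32)
    alpha = gaussian_filter(stuff, sigma=sigma_edge)[..., None]

    return alpha * painted + (1.0 - alpha) * image


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.random((128, 128, 3)).astype(np.float32)
    mask = np.zeros((128, 128), dtype=np.int32)
    mask[64:, :] = 1                      # pretend the lower half is "grass"
    out = paint_stuff_regions(img, mask, stuff_ids={1})
    print(out.shape, out.dtype)
```

Pairs of (original image, partially painted image) produced this way could then serve as the input/target pairs for Pix2Pix-style training, which is the role the summary assigns to the generated datasets.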
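The summary states only that SPGAN adds a second discriminator that consumes segmentation masks to a Pix2Pix-style framework. The sketch below is a minimal, hypothetical PyTorch rendering of that idea; the discriminator layout, the input pairings, and the loss are assumptions, not the thesis architecture.

```python
# Hedged sketch of the SPGAN idea: a generator trained against two
# PatchGAN-style discriminators, one judging the painted image and an extra
# one judging (painted image, stuff mask) pairs, so mistakes in which regions
# get painted are penalised. All names and hyperparameters are illustrative.
import torch
import torch.nn as nn


def patch_discriminator(in_ch):
    # Tiny PatchGAN-style discriminator; the thesis network is likely deeper.
    return nn.Sequential(
        nn.Conv2d(in_ch, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
        nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
        nn.Conv2d(128, 1, 4, padding=1),
    )


d_image = patch_discriminator(in_ch=3 + 3)  # (input photo, painted output)
d_mask = patch_discriminator(in_ch=3 + 1)   # (painted output, stuff mask)
bce = nn.BCEWithLogitsLoss()


def generator_adv_loss(photo, fake_paint, fake_mask):
    """Adversarial term for the generator against both discriminators."""
    logits_img = d_image(torch.cat([photo, fake_paint], dim=1))
    logits_mask = d_mask(torch.cat([fake_paint, fake_mask], dim=1))
    return bce(logits_img, torch.ones_like(logits_img)) + \
        bce(logits_mask, torch.ones_like(logits_mask))


if __name__ == "__main__":
    photo = torch.randn(2, 3, 64, 64)
    fake_paint = torch.randn(2, 3, 64, 64)
    fake_mask = torch.rand(2, 1, 64, 64)   # predicted stuff mask in [0, 1]
    print(generator_adv_loss(photo, fake_paint, fake_mask).item())
```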