Image synthesis in visual machine learning

Image synthesis aims to generate realistic, high-fidelity images automatically. It has attracted increasing interest from both academia and industry in recent years, owing to its wide applications across artificial intelligence (AI) tasks and to recent advances in generative adversarial networks (GANs). Image synthesis can produce realistic images of different objects and scenes, and hence forms a fundamental component of various design tasks, such as the automated generation of artworks, fashion, and advertisement posters. It can also produce self-annotated images that can be applied directly to deep model training, alleviating a key data constraint: deep neural networks usually require large amounts of annotated training images that are expensive and time-consuming to collect manually.

Automated generation of realistic, self-annotated images still faces two major challenges. First, synthesis realism requires fidelity in both image appearance (e.g. colors, brightness, styles) and image geometry (e.g. object sizes, alignment, perspective). Second, generating self-annotated yet useful training images remains an open research topic, and most existing generation networks handle it poorly because their generated images lack diversity.

We investigate automated image synthesis that aims to generate self-annotated, realistic images, either for visual design tasks or for effective training of deep neural networks. Our work falls broadly into three parts.

The first part is composition-based image synthesis, which generates realistic, self-annotated images by automatically embedding foreground objects into background images. Unlike most existing GAN-based generation networks, it can generate new images with superior diversity, and with genuinely new information, because the foreground objects and background images can come from different sources with completely different distributions. We developed three novel image composition techniques to tackle the central challenge of composition-based synthesis: automatically embedding foreground objects at the right locations, with the right appearance and geometry, within the background image.

The second part is translation-based image synthesis, which modifies existing images into new forms that are more useful for deep network training. Whereas most existing image-to-image translation networks adapt image styles and appearance, we developed two image translation networks that instead adapt image geometry: global image viewpoints and local, instance-level object shapes, respectively. The challenge lies in designing networks that estimate reliable geometric transformations, since these strongly affect the geometry of the translated images.

The third part focuses on 3-dimensional (3D) image synthesis, a new problem that aims to embed 3D virtual object models realistically into 2-dimensional (2D) natural images. To ensure that the embedded 3D objects have a realistic appearance, we design novel networks that first estimate the environmental lighting of the natural image and then re-light (render) the embedded 3D objects so that their brightness, color, shadows, etc. harmonize with the natural image.

Extensive experiments with different evaluation metrics show that our proposed synthesis networks generate images with superior fidelity and realism. In addition, we demonstrate that our synthesized images are self-annotated and can be applied directly to train deep neural networks for various computer vision tasks.
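As a rough illustration of the composition-based idea in the first part (a minimal sketch, not the thesis's actual techniques), the following Python snippet places a foreground object into a background under a fixed affine transform standing in for the learned placement, and obtains segmentation and detection annotations for free. The file names and transform values are hypothetical.

```python
# Minimal composition-based synthesis sketch: warp a foreground (RGBA)
# into a background, alpha-blend, and read off the free annotations.
import cv2
import numpy as np

bg = cv2.imread("bg.jpg")                        # background, HxWx3 uint8 (hypothetical file)
fg = cv2.imread("fg.png", cv2.IMREAD_UNCHANGED)  # foreground with alpha, hxwx4 (hypothetical file)

H, W = bg.shape[:2]
# Hand-picked affine transform standing in for the learned placement
# (location, scale, orientation) that the thesis estimates automatically.
M = cv2.getAffineTransform(
    np.float32([[0, 0], [fg.shape[1], 0], [0, fg.shape[0]]]),
    np.float32([[50, 80], [250, 60], [70, 260]]),
)
warped = cv2.warpAffine(fg, M, (W, H))           # foreground in background coordinates

alpha = warped[:, :, 3:4].astype(np.float32) / 255.0
composite = (alpha * warped[:, :, :3] + (1 - alpha) * bg).astype(np.uint8)

# Self-annotation: the warped alpha channel is a segmentation mask, and
# its bounding box is a detection label -- no manual labelling required.
mask = (warped[:, :, 3] > 0).astype(np.uint8)
x, y, w, h = cv2.boundingRect(mask)
print("bbox:", (x, y, w, h))
cv2.imwrite("composite.jpg", composite)
```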
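For the second part, the core operation is a differentiable geometric warp whose parameters a network can regress. Below is a minimal spatial-transformer-style sketch in PyTorch with made-up tensor shapes; it illustrates the mechanism only, not the thesis's architecture.

```python
# Differentiable affine warp: gradients flow from an image-level loss
# back to the transformation parameters, so a network predicting them
# can be trained end-to-end.
import torch
import torch.nn.functional as F

def warp(images, theta):
    """Apply per-image 2x3 affine transforms to a batch, differentiably.

    images: (N, C, H, W) tensor; theta: (N, 2, 3) affine parameters.
    """
    grid = F.affine_grid(theta, images.size(), align_corners=False)
    return F.grid_sample(images, grid, align_corners=False)

images = torch.rand(4, 3, 128, 128)
# Identity transforms as a stand-in for predicted viewpoint / shape changes.
theta = torch.eye(2, 3).repeat(4, 1, 1).requires_grad_(True)
out = warp(images, theta)
out.mean().backward()      # gradients reach the transform parameters
print(theta.grad.shape)    # torch.Size([4, 2, 3])
```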
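For the third part, one common way to re-light an inserted object from estimated scene lighting is a spherical-harmonics (SH) shading model. The sketch below assumes 2nd-order SH with made-up coefficients and random normals; in the thesis, the lighting is estimated from the natural image by a network.

```python
# Relighting sketch: shade an object's surface normals with SH lighting
# so its colors harmonize with the background photo.
import numpy as np

def sh_basis(normals):
    """2nd-order SH basis evaluated at unit normals, shape (N, 9)."""
    x, y, z = normals[:, 0], normals[:, 1], normals[:, 2]
    return np.stack([
        np.full_like(x, 0.2820948),                    # Y00
        0.4886025 * y, 0.4886025 * z, 0.4886025 * x,   # Y1-1, Y10, Y11
        1.0925484 * x * y, 1.0925484 * y * z,
        0.3153916 * (3 * z * z - 1),
        1.0925484 * x * z, 0.5462742 * (x * x - y * y),
    ], axis=1)

# Hypothetical inputs: per-pixel normals and albedo from the 3D model's
# render, and SH lighting coefficients for the background photo.
normals = np.random.randn(1000, 3)
normals /= np.linalg.norm(normals, axis=1, keepdims=True)
albedo = np.full((1000, 3), 0.6)
sh_coeffs = np.array([2.5, 0.2, 0.8, -0.1, 0.0, 0.3, 0.1, -0.2, 0.05])

shading = sh_basis(normals) @ sh_coeffs                # (N,) irradiance
relit = albedo * np.clip(shading, 0, None)[:, None]    # harmonized colors
```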

Bibliographic Details
Main Author: Zhan, Fangneng
Other Authors: Lu Shijian (School of Computer Science and Engineering)
Format: Thesis (Doctor of Philosophy)
Language: English
Published: Nanyang Technological University, 2021
Subjects: Engineering::Computer science and engineering
Online Access:https://hdl.handle.net/10356/148667
DOI: 10.32657/10356/148667
Citation: Zhan, F. (2021). Image synthesis in visual machine learning. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/148667
License: Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)
Institution: Nanyang Technological University