Image and video generation via deep learning
Despite the immense success in image and video generation, several important problems still exist. This thesis aims at addressing the remaining challenges through advanced deep learning techniques. The first attempt is to construct a large-scale facial video dataset, DeeperForensics-1.0, to facilita...
Main Author: | Jiang, Liming |
---|---|
Other Authors: | Chen Change Loy |
Format: | Thesis-Doctor of Philosophy |
Language: | English |
Published: | Nanyang Technological University, 2023 |
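Subjects: | Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision |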
Online Access: | https://hdl.handle.net/10356/172067 |
Institution: | Nanyang Technological University |
id | sg-ntu-dr.10356-172067 |
---|---|
record_format | dspace |
spelling | sg-ntu-dr.10356-172067; 2023-12-01T01:52:37Z; Image and video generation via deep learning; Jiang, Liming; Chen Change Loy; School of Computer Science and Engineering; ccloy@ntu.edu.sg; Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision; Despite the immense success in image and video generation, several important problems still exist. This thesis aims at addressing the remaining challenges through advanced deep learning techniques. The first attempt is to construct a large-scale facial video dataset, DeeperForensics-1.0, to facilitate the following research and prevent the negative impact of generated data via better video manipulation. After securing the countermeasures, a versatile Two-Stream Image-to-image Translation (TSIT) framework is proposed, which has high practical value. Besides, the thesis tackles the remaining issues through a more fundamental and theoretical study, focal frequency loss (FFL), a frequency-level loss function that is complementary to existing spatial losses. The thesis further introduces Adaptive Pseudo Augmentation (APA) for GAN training with limited data, reducing the data requirements. Extensive experiments and analyses showcase the effectiveness of the proposed methods in both perceptual quality and quantitative evaluations. Finally, the thesis envisions potential future work, offering more insights into this field.; Doctor of Philosophy; 2023-11-21T06:50:11Z; 2023-11-21T06:50:11Z; 2023; Thesis-Doctor of Philosophy; Jiang, L. (2023). Image and video generation via deep learning. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/172067; https://hdl.handle.net/10356/172067; 10.32657/10356/172067; en; NTU NAP; This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).; application/pdf; Nanyang Technological University |
institution | Nanyang Technological University |
building | NTU Library |
continent | Asia |
country | Singapore |
content_provider | NTU Library |
collection | DR-NTU |
language | English |
topic | Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision |
spellingShingle | Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision; Jiang, Liming; Image and video generation via deep learning |
description | Despite the immense success in image and video generation, several important problems still exist. This thesis aims at addressing the remaining challenges through advanced deep learning techniques. The first attempt is to construct a large-scale facial video dataset, DeeperForensics-1.0, to facilitate the following research and prevent the negative impact of generated data via better video manipulation. After securing the countermeasures, a versatile Two-Stream Image-to-image Translation (TSIT) framework is proposed, which has high practical value. Besides, the thesis tackles the remaining issues through a more fundamental and theoretical study, focal frequency loss (FFL), a frequency-level loss function that is complementary to existing spatial losses. The thesis further introduces Adaptive Pseudo Augmentation (APA) for GAN training with limited data, reducing the data requirements. Extensive experiments and analyses showcase the effectiveness of the proposed methods in both perceptual quality and quantitative evaluations. Finally, the thesis envisions potential future work, offering more insights into this field. |
author2 | Chen Change Loy |
author_facet | Chen Change Loy; Jiang, Liming |
format | Thesis-Doctor of Philosophy |
author | Jiang, Liming |
title | Image and video generation via deep learning |
publisher | Nanyang Technological University |
publishDate | 2023 |
url | https://hdl.handle.net/10356/172067 |
_version_ | 1784855576564465664 |