Image and video generation via deep learning
Main Author: | |
---|---|
Other Authors: | |
Format: | Thesis-Doctor of Philosophy |
Language: | English |
Published: | Nanyang Technological University, 2023 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/172067 |
Institution: | Nanyang Technological University |
Summary: | Despite the immense success of image and video generation, several important problems remain. This thesis addresses these challenges with advanced deep learning techniques. It first constructs a large-scale facial video dataset, DeeperForensics-1.0, to facilitate subsequent research and to prevent the negative impact of generated data via better video manipulation. With these countermeasures in place, it proposes a versatile Two-Stream Image-to-image Translation (TSIT) framework with high practical value. The thesis then tackles the remaining issues through a more fundamental and theoretical study: focal frequency loss (FFL), a frequency-level loss function that is complementary to existing spatial losses. It further introduces Adaptive Pseudo Augmentation (APA) for GAN training with limited data, reducing the data requirements. Extensive experiments and analyses demonstrate the effectiveness of the proposed methods in both perceptual quality and quantitative evaluations. Finally, the thesis outlines potential future work and offers further insights into the field. |
---|