Visual recognition using artificial intelligence (image inpainting with transformers)

Bibliographic Details
Main Author: Dou, Yuxiao
Other Authors: Yap Kim Hui
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2023
Online Access: https://hdl.handle.net/10356/167299
Institution: Nanyang Technological University
Description
Summary: Image inpainting, a major task in computer vision, is the process of filling in the missing parts of an image. Traditional inpainting methods often struggle with complex or large missing regions. In the last decade, deep learning methods have made significant progress in this area. This project explores the potential of transformers for image inpainting. Transformers have already demonstrated an outstanding ability to capture global structure in NLP, which could be useful in image inpainting as well. However, their computational inefficiency is magnified when dealing with image data. To overcome this weakness, a method combining transformers and CNNs is explored. The method achieves high performance on the Places2 and FFHQ datasets. Since the FFHQ dataset contains a limited number of Asian-face images, an Asian-face dataset (AFD-dataset) was also used to extend the application of the proposed method. In conclusion, this project further explores the potential of transformers in image inpainting and provides useful data and information for future research.
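
The efficiency concern raised in the summary can be illustrated with a rough count. The sketch below is not taken from the project; it only assumes standard self-attention, where every token attends to every other token, so one attention matrix has (number of tokens) squared entries. The 256x256 resolution, the downsampling factor of 8, and the function name are hypothetical choices made for illustration.

```python
def attention_matrix_entries(height: int, width: int, downsample: int = 1) -> int:
    """Entries in one self-attention matrix when every position of the
    (optionally downsampled) feature map is treated as a single token."""
    tokens = (height // downsample) * (width // downsample)
    return tokens * tokens


# Hypothetical 256x256 input; resolution and factor 8 are illustrative assumptions.
pixel_level = attention_matrix_entries(256, 256)              # one token per pixel
after_cnn = attention_matrix_entries(256, 256, downsample=8)  # tokens from a CNN feature map

print(pixel_level)               # 4294967296
print(after_cnn)                 # 1048576
print(pixel_level // after_cnn)  # 4096
```

Under these assumptions, letting a convolutional stage reduce the spatial resolution before attention shrinks the attention matrix by a factor of 4096, which is one common motivation for pairing CNNs with transformers on image data.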