Text restoration using image super resolution

Bibliographic Details
Main Author: Bodipati, Kiran
Other Authors: Chen Change Loy
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2023
Online Access: https://hdl.handle.net/10356/166103
Institution: Nanyang Technological University
Description
Summary: Text recognition and scene text recognition have gained prominence with the emergence of deep learning techniques such as CNNs. However, when the scene data is of low resolution, most models fail to provide accurate results. To this end, super resolution is proposed as a preprocessing technique to improve the resolution of the images. Traditional super resolution models are developed for natural scenes and tend to fail on scene text, whose characteristics make text super resolution challenging. The lack of high-quality datasets for this task also contributes to the poor performance of existing models. In this study, we provide a comprehensive review of existing super resolution techniques and of the techniques specific to scene text data. We also build a new practical dataset for this task: we create high resolution synthetic text data and collect high resolution images by crawling the web. The corresponding low resolution images are created using a practical higher-order degradation model. We train on the Real-ESRGAN architecture, provide a quantitative and qualitative study of the proposed datasets, and demonstrate the performance of the new models. Comparisons against the pre-trained Real-ESRGAN model are provided. The limitations of the proposed datasets and models are discussed.
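
The summary mentions creating low resolution images with a higher-order degradation model. As a rough illustration only, and not the author's actual pipeline, the sketch below applies a blur, downsample, and noise pass more than once, which is the general idea behind a higher-order degradation. The function names, parameter ranges, and the stripe test image are all assumptions made for this example, and the JPEG compression step that such pipelines typically include is omitted to keep the code dependency-free.

```python
import math
import random


def gaussian_kernel(sigma, radius):
    """Return a normalized 1-D Gaussian kernel of length 2 * radius + 1."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur(image, sigma):
    """Separable Gaussian blur with edge replication at the borders."""
    radius = max(1, int(round(2 * sigma)))
    kernel = gaussian_kernel(sigma, radius)
    offsets = range(-radius, radius + 1)
    height, width = len(image), len(image[0])

    def clamp(index, size):
        return min(max(index, 0), size - 1)

    # Horizontal pass, then vertical pass.
    rows = [[sum(k * row[clamp(x + d, width)] for d, k in zip(offsets, kernel))
             for x in range(width)] for row in image]
    return [[sum(k * rows[clamp(y + d, height)][x] for d, k in zip(offsets, kernel))
             for x in range(width)] for y in range(height)]


def downsample(image, factor):
    """Shrink the image by averaging each factor x factor block."""
    height, width = len(image) // factor, len(image[0]) // factor
    area = factor * factor
    return [[sum(image[y * factor + dy][x * factor + dx]
                 for dy in range(factor) for dx in range(factor)) / area
             for x in range(width)] for y in range(height)]


def add_noise(image, std, rng):
    """Add Gaussian noise and clip pixel values back to [0, 1]."""
    return [[min(1.0, max(0.0, p + rng.gauss(0.0, std))) for p in row]
            for row in image]


def degrade(image, rng, order=2, factor=2):
    """Apply blur -> downsample -> noise `order` times (assumed parameter ranges)."""
    for _ in range(order):
        image = blur(image, sigma=rng.uniform(0.5, 1.5))
        image = downsample(image, factor)
        image = add_noise(image, std=rng.uniform(0.0, 0.05), rng=rng)
    return image


# Hypothetical 32 x 32 "text-like" image: vertical strokes on a white background.
high_res = [[0.0 if (x // 4) % 2 == 0 else 1.0 for x in range(32)] for _ in range(32)]
low_res = degrade(high_res, random.Random(0))
print(len(low_res), len(low_res[0]))  # 8 8
```

Each pass halves the resolution, so two passes turn the 32 x 32 input into an 8 x 8 output. The pair (`high_res`, `low_res`) is the kind of training pair a super resolution model would consume.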