Training deep network models for accurate recognition of texts in scenes

Bibliographic Details
Main Author: Chen, Cheng
Other Authors: Lu Shijian
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2021
Online Access: https://hdl.handle.net/10356/148083
Institution: Nanyang Technological University
Description
Summary: Scene Text Recognition is an important task because of its many potential applications in industry. However, it is also a challenging task in Computer Vision because of the irregularity and diversity of scene text images. Among these difficulties, low-resolution images are one of the major problems that remain unsolved. In this paper, a deep learning neural network specialised in scene text recognition is studied and implemented. Multiple ways to improve model performance on low-resolution images are also investigated and compared. More specifically, two strategies for handling low-resolution images are investigated: 1) super-resolving images at the feature level by incorporating a Super-Resolution Unit in the end-to-end trainable model; 2) super-resolving images at the image level using three different state-of-the-art super-resolution models. To ensure a fair comparison, the TextZoom dataset is used throughout the different rounds of experiments, as it contains real-life low-resolution and high-resolution image pairs.