Training deep network models for accurate recognition of texts in scenes
Scene Text Recognition is an important task because of its many potential applications in industry. However, it is also a challenging Computer Vision task because of the irregularity and diversity of scene text images. Among these difficulties, low-resolution images are...
Main Author: | Chen, Cheng |
---|---|
Other Authors: | Lu Shijian |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University 2021 |
Subjects: | Engineering::Computer science and engineering |
Online Access: | https://hdl.handle.net/10356/148083 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-148083 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-148083 2021-04-22T13:12:06Z Training deep network models for accurate recognition of texts in scenes Chen, Cheng Lu Shijian School of Computer Science and Engineering cchen018@e.ntu.edu.sg, Shijian.Lu@ntu.edu.sg Engineering::Computer science and engineering Scene Text Recognition is an important task because of its many potential applications in industry. However, it is also a challenging Computer Vision task because of the irregularity and diversity of scene text images. Among these difficulties, low-resolution images are one of the major problems yet to be fully solved. In this paper, a deep neural network specialised for scene text recognition is studied and implemented. Multiple ways to improve model performance on low-resolution images are also investigated and compared. More specifically, two strategies for handling low-resolution images are investigated: 1) super-resolving images at the feature level by incorporating a Super-Resolution Unit into the end-to-end trainable model; 2) super-resolving images at the image level using three different state-of-the-art super-resolution models. To ensure a fair comparison, the TextZoom dataset is used throughout the experiments, as it contains real-life pairs of low-resolution and high-resolution images. Bachelor of Engineering (Computer Science) 2021-04-22T13:12:06Z 2021-04-22T13:12:06Z 2021 Final Year Project (FYP) Chen, C. (2021). Training deep network models for accurate recognition of texts in scenes. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/148083 https://hdl.handle.net/10356/148083 en SCSE20-0118 application/pdf Nanyang Technological University |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Engineering::Computer science and engineering |
description |
Scene Text Recognition is an important task because of its many potential applications in industry. However, it is also a challenging Computer Vision task because of the irregularity and diversity of scene text images. Among these difficulties, low-resolution images are one of the major problems yet to be fully solved. In this paper, a deep neural network specialised for scene text recognition is studied and implemented. Multiple ways to improve model performance on low-resolution images are also investigated and compared. More specifically, two strategies for handling low-resolution images are investigated: 1) super-resolving images at the feature level by incorporating a Super-Resolution Unit into the end-to-end trainable model; 2) super-resolving images at the image level using three different state-of-the-art super-resolution models. To ensure a fair comparison, the TextZoom dataset is used throughout the experiments, as it contains real-life pairs of low-resolution and high-resolution images. |
author2 |
Lu Shijian |
author_facet |
Lu Shijian Chen, Cheng |
format |
Final Year Project |
author |
Chen, Cheng |
author_sort |
Chen, Cheng |
title |
Training deep network models for accurate recognition of texts in scenes |
publisher |
Nanyang Technological University |
publishDate |
2021 |
url |
https://hdl.handle.net/10356/148083 |
_version_ |
1698713741081706496 |