Training deep network models for accurate recognition of texts in scenes
Scene text recognition has been a challenging research task in computer vision for many years because of the dynamic conditions of text in natural scenes. The emergence of deep learning solutions has created new possibilities and shown significant progress in performance by playing a role in v...
Main Author: | See, Yu Xiang |
---|---|
Other Authors: | Lu Shijian |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2021 |
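Subjects: | Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision |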
Online Access: | https://hdl.handle.net/10356/147992 |
Institution: | Nanyang Technological University |
id | sg-ntu-dr.10356-147992 |
---|---|
record_format | dspace |
spelling | sg-ntu-dr.10356-147992; 2021-04-22T02:26:14Z; Training deep network models for accurate recognition of texts in scenes; See, Yu Xiang; Lu Shijian; School of Computer Science and Engineering; Shijian.Lu@ntu.edu.sg; Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision; Bachelor of Engineering (Computer Science); 2021; Final Year Project (FYP); See, Y. X. (2021). Training deep network models for accurate recognition of texts in scenes. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/147992; en; SCSE20-0113; application/pdf; Nanyang Technological University |
institution | Nanyang Technological University |
building | NTU Library |
continent | Asia |
country | Singapore |
content_provider | NTU Library |
collection | DR-NTU |
language | English |
topic | Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision |
description | Scene text recognition has been a challenging research task in computer vision for many years because of the dynamic conditions of text in natural scenes. The emergence of deep learning solutions has created new possibilities and shown significant progress in performance by playing a role in various vision-based applications. In this paper, deep network models are implemented by emulating a state-of-the-art neural network architecture that uses image-based sequence recognition for scene text recognition. Hyperparameters, including the type of optimizer, learning rate, batch size and number of epochs, are tuned to find the best model configuration for deployment, as measured against benchmark datasets. The configuration using the Adam optimizer was found to perform better than the one using the AdaDelta optimizer mentioned in the original paper. A text recognition program was also built to demonstrate the trained model in a real-time scenario. Recommendations for future work include exploring different methodologies to provide an end-to-end model capable of detecting and recognizing curved text in scene images. |
author2 | Lu Shijian |
author_facet | Lu Shijian; See, Yu Xiang |
format | Final Year Project |
author | See, Yu Xiang |
author_sort | See, Yu Xiang |
title | Training deep network models for accurate recognition of texts in scenes |
publisher | Nanyang Technological University |
publishDate | 2021 |
url | https://hdl.handle.net/10356/147992 |
_version_ | 1698713670974963712 |