Scene text recognition

The scene text recognition problem has recently attracted interest within the deep learning community. Solving this problem would open paths to further inventions, such as in robotic navigation. However, current solutions are far from perfect, and there is potentially more...

Bibliographic Details
Main Author:	Muhammad Afiq Osman
Other Authors:	Lu Shijian
Format:	Final Year Project
Language:	English
Published:	2019
Subjects:	DRNTU::Engineering::Computer science and engineering
Online Access:	http://hdl.handle.net/10356/76861
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-76861
record_format	dspace
spelling	sg-ntu-dr.10356-76861 2023-03-03T20:54:27Z Scene text recognition Muhammad Afiq Osman Lu Shijian School of Computer Science and Engineering DRNTU::Engineering::Computer science and engineering The scene text recognition problem has recently attracted interest within the deep learning community. Solving this problem would open paths to further inventions, such as in robotic navigation. However, current solutions are far from perfect, and there is potentially more that could be done. In this FYP, we investigate and attempt to replicate an existing study on scene text recognition. The goal is to understand the theory and gain practical experience in implementing and training a deep learning model for this problem. We carry out hyperparameter tuning in search of the optimal values for the batch size, learning rate, number of epochs, and optimizer. The optimal values found for batch size and learning rate coincide with the common rationale. The same goes for the number of epochs, where the resulting trend suggests that model accuracy starts to plateau as the number of epochs increases. However, the best optimizer was found to be Adam, which differs from Adadelta, the optimizer used in the original study. Adadelta in fact performed much worse, producing ‘nan’ test and train errors on many occasions. Future recommendations for this FYP include experimenting with the structure of the CRNN model in order to understand more deeply the effects of its CNN and RNN layers. Bachelor of Engineering (Computer Science) 2019-04-20T05:28:28Z 2019-04-20T05:28:28Z 2019 Final Year Project (FYP) http://hdl.handle.net/10356/76861 en Nanyang Technological University 38 p. application/pdf
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	DRNTU::Engineering::Computer science and engineering
spellingShingle	DRNTU::Engineering::Computer science and engineering Muhammad Afiq Osman Scene text recognition
description	The scene text recognition problem has recently attracted interest within the deep learning community. Solving this problem would open paths to further inventions, such as in robotic navigation. However, current solutions are far from perfect, and there is potentially more that could be done. In this FYP, we investigate and attempt to replicate an existing study on scene text recognition. The goal is to understand the theory and gain practical experience in implementing and training a deep learning model for this problem. We carry out hyperparameter tuning in search of the optimal values for the batch size, learning rate, number of epochs, and optimizer. The optimal values found for batch size and learning rate coincide with the common rationale. The same goes for the number of epochs, where the resulting trend suggests that model accuracy starts to plateau as the number of epochs increases. However, the best optimizer was found to be Adam, which differs from Adadelta, the optimizer used in the original study. Adadelta in fact performed much worse, producing ‘nan’ test and train errors on many occasions. Future recommendations for this FYP include experimenting with the structure of the CRNN model in order to understand more deeply the effects of its CNN and RNN layers.
author2	Lu Shijian
author_facet	Lu Shijian Muhammad Afiq Osman
format	Final Year Project
author	Muhammad Afiq Osman
author_sort	Muhammad Afiq Osman
title	Scene text recognition
title_short	Scene text recognition
title_full	Scene text recognition
title_fullStr	Scene text recognition
title_full_unstemmed	Scene text recognition
title_sort	scene text recognition
publishDate	2019
url	http://hdl.handle.net/10356/76861
_version_	1759858026404118528

Similar Items