DEVELOPMENT OF MODEL ARCHITECTURE IN OPTICAL CHARACTER RECOGNITION APPLICATION FOR CREDIT CARD AND HANDWRITTEN DOCUMENTS

Bibliographic Details
Main Author: Rivaldo, Richard
Format: Final Project
Language: Indonesia
Online Access: https://digilib.itb.ac.id/gdl/view/76575
Institution: Institut Teknologi Bandung
Description
Summary: The volume of transactions paid by credit card is increasing. However, the long and complicated process makes these transactions difficult for many users to complete. Similar friction appears in many document-handling processes at public institutions, for example those involving university certificates and handwritten documents. Limited internet access in Indonesia is a further constraint. To help automate these processes, an optical character recognition (OCR) application was implemented for mobile devices with offline use in mind. In the designed architecture, text in the input image is first detected with the CRAFT text detector. The detected image segments are then passed to a text recognition module built on a TransformerOCR model. Entities in the recognized text are then detected and classified by a named entity recognition (NER) model built with spaCy. The models selected through benchmarking were to be converted to TFLite format so that they could be embedded on users' mobile devices for offline use. However, the conversion experiments failed for all of the models. This failure led to a change in architecture and approach: the end-to-end OCR flow was implemented with a client-server approach instead. The OCR logic is implemented as a backend application developed with FastAPI and deployed on AWS EC2.
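
The three-stage flow described in the summary (text detection, text recognition, entity classification) can be sketched as a simple function composition. This is only an illustrative sketch, not the project's implementation: every name here (`run_ocr`, `Entity`, the `fake_*` stubs, and the `CARD_NUMBER` / `CARD_HOLDER` labels) is hypothetical, and the stubs merely stand in for CRAFT, TransformerOCR, and the spaCy NER model so that the data flow can be followed without any of those models installed.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical types; they only illustrate the data flow between stages.
Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


@dataclass
class Entity:
    text: str
    label: str


Detector = Callable[[bytes], List[Box]]       # stands in for CRAFT
Recognizer = Callable[[bytes, Box], str]      # stands in for TransformerOCR
EntityTagger = Callable[[str], List[Entity]]  # stands in for the spaCy NER model


def run_ocr(image: bytes, detect: Detector, recognize: Recognizer,
            tag: EntityTagger) -> List[Entity]:
    """Detect text regions, recognize each one, then tag entities."""
    boxes = detect(image)
    # Assumed reading order: top-to-bottom, then left-to-right.
    boxes = sorted(boxes, key=lambda b: (b[1], b[0]))
    lines = [recognize(image, box) for box in boxes]
    return tag("\n".join(lines))


# Stub stages with made-up data, used only to exercise the pipeline.
def fake_detect(image: bytes) -> List[Box]:
    return [(10, 40, 200, 60), (10, 10, 200, 30)]


def fake_recognize(image: bytes, box: Box) -> str:
    # Look up canned text by the top coordinate of the box.
    return {10: "0000 0000 0000 0000", 40: "JOHN DOE"}[box[1]]


def fake_tag(text: str) -> List[Entity]:
    entities = []
    for line in text.split("\n"):
        is_number = line.replace(" ", "").isdigit()
        entities.append(Entity(line, "CARD_NUMBER" if is_number else "CARD_HOLDER"))
    return entities


if __name__ == "__main__":
    for entity in run_ocr(b"", fake_detect, fake_recognize, fake_tag):
        print(entity.label, entity.text)
```

Passing the stages in as callables keeps the orchestration independent of where the models run. Under the client-server approach the summary describes, a function like `run_ocr` would presumably be called from a backend request handler rather than on the device.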