DEVELOPMENT OF MODEL ARCHITECTURE IN OPTICAL CHARACTER RECOGNITION APPLICATION FOR CREDIT CARD AND HANDWRITTEN DOCUMENTS
Main Author: | |
---|---|
Format: | Final Project |
Language: | Indonesia |
Online Access: | https://digilib.itb.ac.id/gdl/view/76575 |
Institution: | Institut Teknologi Bandung |
Summary: | The volume of transactions paid by credit card is increasing. However, the long and
complicated process makes these transactions difficult for many users to complete. Similar
inconvenience arises in many document-handling processes at public institutions, for example
those involving university certificates and handwritten documents. Limited internet
accessibility in Indonesia must also be taken into account. To help automate these processes,
an optical character recognition (OCR) application is implemented for mobile devices with
offline use in mind.
The application's architecture first detects the text in the provided image using the CRAFT
text detector. The resulting image segments are then passed to a text recognition module
built on a TransformerOCR model. Entities in the recognized text are then detected and
classified by an NER model built with spaCy. The models selected through benchmarking are to
be converted to TFLite format so that they can be embedded on users' mobile devices for
offline use.
However, the attempts to convert these models failed for all of them. This failure changed the
architecture and approach: the end-to-end OCR flow is instead implemented with a
client-server approach. The OCR logic is implemented as a backend application developed with
FastAPI and deployed on AWS EC2. |
---|
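
The three-stage flow described in the summary (text detection, then text recognition, then entity recognition) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the author's implementation: the names `Box`, `crop`, and `run_ocr_pipeline` are hypothetical, the image is modeled as a plain list of pixel rows, and the CRAFT, TransformerOCR, and spaCy models are represented only by caller-supplied functions. The top-to-bottom, left-to-right reading order is also an assumption.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


# Hypothetical types for illustration; none of these names come from the original project.
@dataclass(frozen=True)
class Box:
    x: int
    y: int
    width: int
    height: int


Image = List[List[int]]   # grayscale image as rows of pixel values
Entity = Tuple[str, str]  # (entity text, entity label)


def crop(image: Image, box: Box) -> Image:
    """Cut the region covered by `box` out of the image."""
    rows = image[box.y:box.y + box.height]
    return [row[box.x:box.x + box.width] for row in rows]


def run_ocr_pipeline(
    image: Image,
    detect: Callable[[Image], List[Box]],           # stand-in for a text detector such as CRAFT
    recognize: Callable[[Image], str],              # stand-in for a text recognition model
    extract_entities: Callable[[str], List[Entity]],  # stand-in for an NER model
) -> List[Entity]:
    """Run detection, recognition, and entity extraction in sequence."""
    boxes = detect(image)
    # Assumed reading order: top-to-bottom, then left-to-right.
    ordered = sorted(boxes, key=lambda b: (b.y, b.x))
    texts = [recognize(crop(image, b)) for b in ordered]
    return extract_entities(" ".join(texts))
```

Because each stage is passed in as a function, the same flow could run with on-device models or behind a server endpoint; only the callables would change.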