DEVELOPMENT OF FORMAL BANK STATEMENT DOCUMENT UNDERSTANDING SYSTEM WITH OPTICAL CHARACTER RECOGNITION MODELS FOR MOBILE APPLICATION

Digital document archiving overcomes the limitations of physical document quality and facilitates information processing. The digitalization process can be aided by Optical Character Recognition (OCR) systems. However, integrated OCR software for formal documents, particularly bank statements, co...

Bibliographic Details
Main Author: Aisha Geubrina Yasmin, Syarifah
Format: Final Project
Language: Indonesia
Online Access: https://digilib.itb.ac.id/gdl/view/78172
Institution: Institut Teknologi Bandung
id id-itb.:78172
spelling id-itb.:78172 2023-09-18T10:32:09Z
keywords Archiving, digital, OCR, bank statement, internet access, text detection, text recognition, NER tagging, backend system
institution Institut Teknologi Bandung
building Institut Teknologi Bandung Library
continent Asia
country Indonesia
content_provider Institut Teknologi Bandung
collection Digital ITB
description Digital document archiving overcomes the quality limitations of physical documents and makes information easier to process. Optical Character Recognition (OCR) systems can support the digitization process. However, integrated OCR software for formal documents, particularly bank statements, that takes internet access conditions in Indonesia into account has not been widely developed, so a system for understanding formal bank statement documents is needed. The system comprises five stages: image preprocessing, text detection, re-alignment, text recognition, and NER tagging, which together extract essential information from bank statements. Text detection uses PP-OCRv3 (F1/3 Score of 93.8%), text recognition uses SVTR (CER of 5.629%), and NER tagging uses spaCy's NER model (accuracy of 100% for BCA and 99% for BNI). Each model was obtained by retraining a pre-trained model on synthesized bank statement data. Each model is tested on the evaluation metrics specific to its task, as well as on model size and inference time. Additionally, to minimize internet usage, a backend system is implemented as an API using the Flask framework.
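
The description reports a character error rate (CER) of 5.629% for the SVTR text recognition model. The record does not state how the CER was computed, so the following is only a minimal sketch of the common definition: the total character-level edit distance between recognized and reference strings, divided by the total number of reference characters. The function names `levenshtein` and `cer` and the sample strings are hypothetical and are not taken from the thesis.

```python
from typing import Sequence


def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of character insertions, deletions, and
    substitutions needed to turn `hyp` into `ref`."""
    # prev[j] holds the distance between the processed prefix of ref
    # and the first j characters of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(references: Sequence[str], hypotheses: Sequence[str]) -> float:
    """Corpus-level CER: total edits divided by total reference characters."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_edits = sum(levenshtein(r, h) for r, h in zip(references, hypotheses))
    return total_edits / total_chars


if __name__ == "__main__":
    # Hypothetical example: the recognizer reads the letter O as the digit 0.
    refs = ["SALDO AWAL"]
    hyps = ["SALD0 AWAL"]
    print(f"CER = {cer(refs, hyps):.1%}")
```

Under this definition, lower values are better, and a CER of 5.629% would correspond to roughly 5 to 6 character errors per 100 reference characters.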