DEVELOPMENT OF FORMAL BANK STATEMENT DOCUMENT UNDERSTANDING SYSTEM WITH OPTICAL CHARACTER RECOGNITION MODELS FOR MOBILE APPLICATION
Digital document archiving overcomes the limitations of physical document quality and facilitates information processing. The digitalization process can be aided by Optical Character Recognition (OCR) systems. However, integrated OCR software for formal documents, particularly bank statements, co...
Main Author: | Aisha Geubrina Yasmin, Syarifah |
---|---|
Format: | Final Project |
Language: | Indonesia |
Online Access: | https://digilib.itb.ac.id/gdl/view/78172 |
Institution: | Institut Teknologi Bandung |
id |
id-itb.:78172 |
---|---|
spelling |
id-itb.:78172; 2023-09-18T10:32:09Z; DEVELOPMENT OF FORMAL BANK STATEMENT DOCUMENT UNDERSTANDING SYSTEM WITH OPTICAL CHARACTER RECOGNITION MODELS FOR MOBILE APPLICATION; Aisha Geubrina Yasmin, Syarifah; Indonesia; Final Project; Archiving, digital, OCR, bank statement, internet access, text detection, text recognition, NER tagging, backend system; INSTITUT TEKNOLOGI BANDUNG; https://digilib.itb.ac.id/gdl/view/78172; text |
institution |
Institut Teknologi Bandung |
building |
Institut Teknologi Bandung Library |
continent |
Asia |
country |
Indonesia |
content_provider |
Institut Teknologi Bandung |
collection |
Digital ITB |
language |
Indonesia |
description |
Digital document archiving overcomes the limitations of physical document quality
and facilitates information processing. Digitization can be aided by Optical
Character Recognition (OCR) systems. However, integrated OCR software for formal
documents, particularly bank statements, that takes internet access in Indonesia
into account has not been widely developed. A formal bank statement document
understanding system is therefore needed. The system comprises five stages: image
preprocessing, text detection, re-alignment, text recognition, and NER tagging,
which together extract essential information from bank statements. The models
used are PP-OCRv3 for text detection (F1/3 Score of 93.8%), SVTR for text
recognition (CER of 5.629%), and spaCy's NER model for NER tagging (accuracy of
100% (BCA) and 99% (BNI)). These models were obtained by retraining pre-trained
models on synthesized bank statement data. Each model is tested against its own
evaluation metrics, as well as model size and inference time. Additionally, to
minimize internet usage, a backend system is implemented as an API using the
Flask framework. |
format |
Final Project |
author |
Aisha Geubrina Yasmin, Syarifah |
title |
DEVELOPMENT OF FORMAL BANK STATEMENT DOCUMENT UNDERSTANDING SYSTEM WITH OPTICAL CHARACTER RECOGNITION MODELS FOR MOBILE APPLICATION |
url |
https://digilib.itb.ac.id/gdl/view/78172 |
_version_ |
1822008503776051200 |
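
The description reports a character error rate (CER) for the text recognition model. As a minimal sketch of how such a metric is commonly computed, assuming the usual definition (Levenshtein edit distance between the recognized text and the reference text, divided by the reference length), the following Python example may help. The function names and the sample strings are hypothetical illustrations; they are not taken from the thesis, and the thesis may compute CER differently.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, and substitutions turning one string into the other."""
    # prev[j] holds the distance between the first i-1 reference characters
    # and the first j hypothesis characters.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]  # distance from i reference characters to an empty hypothesis
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,                      # delete a reference character
                curr[j - 1] + 1,                  # insert a hypothesis character
                prev[j - 1] + substitution_cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = edit distance / number of characters in the reference."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical example: ground-truth line versus an OCR output in which
# the letter "O" was read as the digit "0" and the last digit was dropped.
reference_line = "SALDO AWAL 1.250.000"
recognized_line = "SALD0 AWAL 1.250.00"
print(f"CER = {character_error_rate(reference_line, recognized_line):.3f}")
```

Under this definition, a CER of 5.629% would correspond to roughly 5 or 6 character-level edits per 100 reference characters, averaged over the evaluation set.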