DEVELOPMENT OF AN OCR MODULE FOR READING KK AND A SYNTHETIC DATA GENERATION MODULE FOR KTP, SIM, AND KK

Data digitalization has become a crucial topic, especially in the current modern era. Manual document processing requires more time and effort compared to automated processing utilizing digital data. In the context of document digitalization, such as ID cards (KTP), driver's licenses (SIM),...

Bibliographic Details
Main Author:	Riemenn, Louis
Format:	Final Project
Language:	Indonesia
Online Access:	https://digilib.itb.ac.id/gdl/view/78168
Institution:	Institut Teknologi Bandung

id	id-itb.:78168
spelling	id-itb.:78168; 2023-09-18T10:28:14Z; DEVELOPMENT OF AN OCR MODULE FOR READING KK AND A SYNTHETIC DATA GENERATION MODULE FOR KTP, SIM, AND KK; Riemenn, Louis; Indonesia; Final Project; optical character recognition, synthetic data, text detection, text recognition, document, KK; INSTITUT TEKNOLOGI BANDUNG; https://digilib.itb.ac.id/gdl/view/78168; text
institution	Institut Teknologi Bandung
building	Institut Teknologi Bandung Library
continent	Asia
country	Indonesia
content_provider	Institut Teknologi Bandung
collection	Digital ITB
language	Indonesia
description	Digitizing data has become increasingly important. Processing documents manually takes more time and effort than automated processing of digital data. When digitizing documents such as identity cards (KTP), driver's licenses (SIM), and family cards (KK), optical character recognition (OCR) converts the text in an image into a digital format that can then be processed further. This final project developed three modules: synthetic data generation for KTP, SIM, and KK; text detection for KK documents; and text recognition for KK documents. Synthetic data was generated with a synthetic-composite technique, in which original images are modified by adding synthetic elements that were not previously present. Specific information was first removed from the original image of each KTP, SIM, and KK document. False information was then added to the modified image, followed by noise and tilting. The text detection and text recognition modules were each built in three stages. The first stage selected the most suitable model through benchmarking. The second stage trained the chosen model on a KK dataset. The final stage evaluated the model to confirm that it performed well and had improved over its state before training. Based on the benchmarking, DB with a MobileNetV3 backbone was selected for text detection, and SVTR with an SVTR-Tiny backbone was selected for text recognition. Both models improved after training on the KK dataset: the text detection model achieved a precision of 97.80%, a recall of 97.29%, and an F1 score of 97.54%, while the text recognition model achieved an accuracy of 99.99%.
format	Final Project
author	Riemenn, Louis
title	DEVELOPMENT OF AN OCR MODULE FOR READING KK AND A SYNTHETIC DATA GENERATION MODULE FOR KTP, SIM, AND KK
url	https://digilib.itb.ac.id/gdl/view/78168
_version_	1822995648429948928
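
The abstract describes the synthetic-composite steps only in outline: remove specific information from an original image, add false information, then introduce noise and tilting. The following is a minimal sketch of that order of operations, not the project's actual implementation, which this record does not describe. All function names and parameters are hypothetical, images are modeled as nested lists of grayscale values rather than real scans, and the "false information" is assumed to arrive as a pre-rendered patch.

```python
import math
import random

# Assumption: an image is a list of rows of grayscale pixels (0 = black, 255 = white).
Image = list[list[int]]


def erase_region(img: Image, top: int, left: int, height: int, width: int, fill: int = 255) -> Image:
    """Step 1: remove specific information by painting a field region with a background value."""
    out = [row[:] for row in img]
    for y in range(top, top + height):
        for x in range(left, left + width):
            out[y][x] = fill
    return out


def paste_patch(img: Image, patch: Image, top: int, left: int) -> Image:
    """Step 2: add false information by pasting a pre-rendered patch (hypothetical fake text)."""
    out = [row[:] for row in img]
    for dy, patch_row in enumerate(patch):
        for dx, value in enumerate(patch_row):
            out[top + dy][left + dx] = value
    return out


def add_noise(img: Image, sigma: float, rng: random.Random) -> Image:
    """Step 3: add Gaussian noise to every pixel, clamped to the 0..255 range."""
    return [[min(255, max(0, round(p + rng.gauss(0.0, sigma)))) for p in row] for row in img]


def tilt(img: Image, degrees: float, fill: int = 255) -> Image:
    """Step 4: rotate about the image centre using nearest-neighbour inverse mapping."""
    h, w = len(img), len(img[0])
    cy, cx = (h - 1) / 2, (w - 1) / 2
    c, s = math.cos(math.radians(degrees)), math.sin(math.radians(degrees))
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Map each output pixel back to the source pixel it came from.
            sx = round(c * (x - cx) + s * (y - cy) + cx)
            sy = round(-s * (x - cx) + c * (y - cy) + cy)
            if 0 <= sx < w and 0 <= sy < h:
                out[y][x] = img[sy][sx]
    return out


def make_synthetic(img: Image, field: tuple[int, int, int, int], patch: Image,
                   sigma: float, degrees: float, rng: random.Random) -> Image:
    """Apply the four steps in the order given in the abstract."""
    top, left, height, width = field
    erased = erase_region(img, top, left, height, width)
    filled = paste_patch(erased, patch, top, left)
    return tilt(add_noise(filled, sigma, rng), degrees)
```

Passing a seeded `random.Random` keeps each generated sample reproducible, which helps when a dataset has to be regenerated with the same noise.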

Similar Items