Recognizing texts on maps
Current popular optical character recognition (OCR) models struggle to achieve accurate results in both text positioning and text labelling prediction due to the complex, diverse and noisy nature of historical maps. Oftentimes, text-bounding polygons in historical maps intersect each other, and te...
Main Author: | Tan, Pheng Khai |
---|---|
Other Authors: | Li Boyang |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University 2024 |
Subjects: | Computer and Information Science |
Online Access: | https://hdl.handle.net/10356/175181 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-175181 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-175181; 2024-04-19T15:42:31Z; Recognizing texts on maps; Tan, Pheng Khai; Li Boyang; School of Computer Science and Engineering; boyang.li@ntu.edu.sg; Computer and Information Science; Bachelor's degree; 2024-04-19T12:11:31Z; 2024; Final Year Project (FYP); Tan, P. K. (2024). Recognizing texts on maps. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/175181; en; application/pdf; Nanyang Technological University |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Computer and Information Science |
spellingShingle |
Computer and Information Science; Tan, Pheng Khai; Recognizing texts on maps |
description |
Current popular optical character recognition (OCR) models struggle to achieve accurate
results in both text positioning and text labelling prediction because historical maps are
complex, diverse and noisy. Text-bounding polygons in historical maps often intersect each
other, and text is overlaid on surrounding geographical features, which adds noise and makes
recognizing text on maps more difficult.
This paper presents a method to automatically generate large amounts of annotated training
data by using CycleGAN to generate synthetic historical maps. The generated data is then
used to train a DPText-DETR model, selected for a distinct feature that gives it the potential
to excel at this task. A pipeline is then proposed to make historical map OCR more
accessible and user-friendly.
The proposed method is analysed and evaluated against a baseline model that represents the
strengths and weaknesses of modern OCR tools, as well as a state-of-the-art model. The
evaluations show that this approach not only simplifies the generation of annotated training
data but also enhances OCR accuracy: the model improves precision by 4.25%, recall by
2.58%, and overall F1 by 3.23% over the baseline and state-of-the-art models in terms of
Wolff’s metric, setting a new benchmark for historical map text recognition. |
author2 |
Li Boyang |
author_facet |
Li Boyang; Tan, Pheng Khai |
format |
Final Year Project |
author |
Tan, Pheng Khai |
author_sort |
Tan, Pheng Khai |
title |
Recognizing texts on maps |
title_short |
Recognizing texts on maps |
title_full |
Recognizing texts on maps |
title_fullStr |
Recognizing texts on maps |
title_full_unstemmed |
Recognizing texts on maps |
title_sort |
recognizing texts on maps |
publisher |
Nanyang Technological University |
publishDate |
2024 |
url |
https://hdl.handle.net/10356/175181 |
_version_ |
1800916223267962880 |
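
The description above reports gains in precision, recall and F1. As a reading aid only, the sketch below shows how these three scores relate once true positives, false positives and false negatives have been counted. It is a generic illustration, not the project's evaluation code: how a predicted text instance is matched to a ground-truth one is defined by Wolff's metric and is not specified in this record, and the function name and the counts in the usage line are hypothetical.

```python
from typing import Tuple


def precision_recall_f1(true_positives: int,
                        false_positives: int,
                        false_negatives: int) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) from match counts.

    Hypothetical helper: the counts are assumed to come from some
    matching step between predicted and ground-truth text instances,
    which is not implemented here.
    """
    predicted = true_positives + false_positives   # everything the model output
    actual = true_positives + false_negatives      # everything in the ground truth

    # Guard against empty predictions or an empty ground truth.
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0

    # F1 is the harmonic mean of precision and recall.
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    p, r, f1 = precision_recall_f1(true_positives=3,
                                   false_positives=1,
                                   false_negatives=3)
    print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Because F1 is a harmonic mean, it stays closer to the lower of the two scores, which is why a model has to improve both precision and recall to raise it substantially.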