Advanced stroke labelling technique based on directions features for Arabic character segmentation

Offline character segmentation of text images is an important step in many document image analysis and recognition (DIAR) applications. However, character segmentation of both writing styles (printed and handwritten) remains an open problem. Moreover, manual segmentation is time-consuming an...

Bibliographic Details
Main Authors: Abu-Ain, Tarik Abdel-Kareem; Siti Norul Huda Sheikh Abdullah; Khairuddin Omar; Siti Zaharah Abd Rahman
Format: Article
Language: English
Published: Penerbit Universiti Kebangsaan Malaysia 2019
Online Access: http://journalarticle.ukm.my/14154/1/26483-111116-1-PB.pdf
http://journalarticle.ukm.my/14154/
http://ejournals.ukm.my/apjitm/issue/view/1179
Institution: Universiti Kebangsaan Malaysia
id my-ukm.journal.14154
record_format eprints
spelling my-ukm.journal.14154 2020-02-07T11:10:28Z http://journalarticle.ukm.my/14154/ Advanced stroke labelling technique based on directions features for Arabic character segmentation. Abu-Ain, Tarik Abdel-Kareem; Siti Norul Huda Sheikh Abdullah; Khairuddin Omar; Siti Zaharah Abd Rahman. Penerbit Universiti Kebangsaan Malaysia 2019-06 Article PeerReviewed application/pdf en http://journalarticle.ukm.my/14154/1/26483-111116-1-PB.pdf
Abu-Ain, Tarik Abdel-Kareem and Siti Norul Huda Sheikh Abdullah and Khairuddin Omar and Siti Zaharah Abd Rahman (2019) Advanced stroke labelling technique based on directions features for Arabic character segmentation. Asia-Pacific Journal of Information Technology and Multimedia, 8 (1). pp. 97-127. ISSN 2289-2192 http://ejournals.ukm.my/apjitm/issue/view/1179
institution Universiti Kebangsaan Malaysia
building Tun Sri Lanang Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Kebangsaan Malaysia
content_source UKM Journal Article Repository
url_provider http://journalarticle.ukm.my/
language English
description Offline character segmentation of text images is an important step in many document image analysis and recognition (DIAR) applications. However, character segmentation of both writing styles (printed and handwritten) remains an open problem. Moreover, manual segmentation is time-consuming and impractical for large numbers of documents. For unconstrained cursive handwriting, automatic character segmentation is even more challenging and complex. Arabic script suffers from several common problems, such as overlapping sub-words, overlapping characters, and missed characters. These challenges have attracted the attention of DIAR researchers working on Arabic character segmentation. The proposed method combines a new advanced Stroke Labelling based on Direction Features (SLDF2) technique with a modified vertical projection histogram (MVPH) technique. The technique extracts the relationship between each text stroke pixel and its 8 neighboring foreground pixels and labels the pixel with the appropriate value before identifying the possible segmentation points. The text was prepared for segmentation using multiple preprocessing steps and the advanced stroke labelling technique based on direction features. Several structural rules of the Arabic language were formulated to detect the candidate segmentation points (CSP), detect many character-overlapping cases, solve the missed-characters problem that arises from using the text skeleton in VPH, and validate the CSP. All techniques and methods were tested on the ACDAR benchmark database. The validation method used to measure segmentation accuracy was a quantitative analysis that includes Recall, Precision, and F-measure tests. The average accuracy of the proposed segmentation method was 92.44%, which outperforms the state-of-the-art method.
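The three building blocks named in the description (inspecting the 8 neighbours of each stroke pixel, a vertical projection histogram, and Precision/Recall/F-measure scoring) can be illustrated with a minimal sketch. This is not the authors' SLDF2 or MVPH: the record does not give the label values, the histogram modifications, or the structural rules, so the code below only counts foreground neighbours, computes a plain (unmodified) vertical projection, and applies the standard metric formulas. All function names and the toy image are hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]  # assumption: binary image, 1 = stroke (foreground), 0 = background


def neighbour_counts(img: Image) -> Image:
    """For each foreground pixel, count its foreground pixels among the 8 neighbours.

    A generic stand-in for a stroke-labelling step; the real SLDF2 labels
    are not specified in this record.
    """
    rows, cols = len(img), len(img[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if img[r][c] != 1:
                continue  # background pixels keep the value 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue  # skip the pixel itself
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols and img[rr][cc] == 1:
                        out[r][c] += 1
    return out


def vertical_projection(img: Image) -> List[int]:
    """Plain vertical projection histogram: foreground pixels per column."""
    return [sum(column) for column in zip(*img)]


def precision_recall_f(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Standard precision, recall and F-measure from segmentation-point counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Toy stroke: a horizontal bar with a short downward tail at its right end.
toy = [
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0],
    [0, 0, 0, 1, 0],
]
counts = neighbour_counts(toy)
histogram = vertical_projection(toy)
```

In a projection-based segmenter, columns with a low histogram value along a connecting stroke are the usual candidates for segmentation points; how the MVPH and the structural rules refine those candidates is described only in the full article.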
format Article
author Abu-Ain, Tarik Abdel-Kareem
Siti Norul Huda Sheikh Abdullah
Khairuddin Omar
Siti Zaharah Abd Rahman
author_sort Abu-Ain, Tarik Abdel-Kareem
title Advanced stroke labelling technique based on directions features for Arabic character segmentation
title_sort advanced stroke labelling technique based on directions features for arabic character segmentation
publisher Penerbit Universiti Kebangsaan Malaysia
publishDate 2019
url http://journalarticle.ukm.my/14154/1/26483-111116-1-PB.pdf
http://journalarticle.ukm.my/14154/
http://ejournals.ukm.my/apjitm/issue/view/1179
_version_ 1662755929862438912