Offline text-independent Chinese writer identification method with two-tier image retrieval

Bibliographic Details
Main Author: Tan, Gloria Jennis
Format: Thesis
Language: English
Published: 2019
Online Access: http://eprints.utm.my/id/eprint/96211/1/TanGloriaJennisPSC2019.pdf.pdf
http://eprints.utm.my/id/eprint/96211/
http://dms.library.utm.my:8080/vital/access/manager/Repository/vital:142139
Institution: Universiti Teknologi Malaysia
Description
Summary: Writer identification is essential for establishing the authenticity of a document in forensic expert decision-making. However, handwriting differs across languages, and Chinese in particular poses a distinct challenge for identifying the writer. The main challenge facing current researchers is that traditional methods are difficult to apply to offline text-independent Chinese writer identification because of the complexity of Chinese writing structure and style. Furthermore, the previous method relies heavily on the choice of window size, which introduces ambiguity and leads to inconsistent results when that method is used to find the best-matched document in a large image repository. As a result, the search space remains very large, and the previous method has not been shown to be effective at retrieving relevant documents from a large image repository. This research addressed these problems by developing a new scheme for offline text-independent Chinese writer identification that combines an enhanced feature extraction method with a two-tier image retrieval mechanism to reduce the search space and increase the identification rate. The technique involved three essential steps. First, the first-tier phase used the Slantlet Transform based Local Binary Pattern (SLT-LBP) to bring out fine details. Then, the sixty best-matching handwriting images were shortlisted for the second-tier phase, which used the Hierarchical Centroid (HC) of image pixels for feature extraction. Finally, thirty shortlisted images were used as input to the identification phase, which used Gray-Level Difference Method (GLDM) features. On the HIT-MW dataset, the identification rate improved from 95.4% for the previous method to 96.68%.
The contribution of this study is that it highlights the importance of using a two-tier retrieval mechanism to reduce the search space in a large database in order to achieve higher accuracy. In addition, the development of a size-independent writer identification mechanism is novel and supports real-world application.
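
The coarse-to-fine search described in the summary can be illustrated with a minimal Python sketch. It is not the thesis implementation: it assumes the three feature vectors per image (SLT-LBP, HC, and GLDM in the summary) have already been computed elsewhere, it uses Euclidean distance as a stand-in because the summary does not name a matching measure, and the names `shortlist`, `identify_writer`, and the stage keys `"tier1"`, `"tier2"`, `"final"` are hypothetical. Only the shortlist sizes of sixty and thirty come from the summary.

```python
from math import sqrt
from typing import Dict, List, Sequence

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    """Euclidean distance between two equal-length feature vectors.

    Assumption: the summary does not state which distance the thesis uses.
    """
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def shortlist(query: Vector, features: Dict[str, Vector],
              candidates: List[str], keep: int) -> List[str]:
    """Return the `keep` candidate ids whose features are closest to the query."""
    ranked = sorted(candidates, key=lambda doc_id: euclidean(query, features[doc_id]))
    return ranked[:keep]


def identify_writer(query: Dict[str, Vector],
                    gallery: Dict[str, Dict[str, Vector]],
                    writers: Dict[str, str],
                    keep_tier1: int = 60,
                    keep_tier2: int = 30) -> str:
    """Two-tier retrieval followed by identification.

    query   maps a stage name to the query image's feature vector for that stage.
    gallery maps a stage name to {document id: feature vector}.
    writers maps a document id to its writer label.
    """
    candidates = list(writers)
    # Tier 1: prune the whole repository down to keep_tier1 documents.
    candidates = shortlist(query["tier1"], gallery["tier1"], candidates, keep_tier1)
    # Tier 2: prune the survivors again with a second feature set.
    candidates = shortlist(query["tier2"], gallery["tier2"], candidates, keep_tier2)
    # Identification: the nearest remaining document decides the writer.
    best = shortlist(query["final"], gallery["final"], candidates, 1)[0]
    return writers[best]
```

Each stage compares the query only against the survivors of the previous stage, so the final and presumably most discriminative comparison runs on at most `keep_tier2` documents rather than on the whole repository. A document pruned in an early tier can no longer be returned, even if it would have matched best in the last stage.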