Content extraction of historical Malay manuscripts based on event ontology framework

This article aims to explore representation of the content knowledge of historical Malay manuscripts by extracting the event features using an event ontology framework. The manuscript used during the testing is Sulalatus Salatin (Sejarah Melayu (SIC)) by Abdul Ahmad Samad and it was published at Uni...

Bibliographic Details
Main Authors: Zahila, M. N., Noorhidawati, Abdullah, Mohd Khalid, Yanti Idaya Aspura
Format: Article
Published: IOS Press 2021
Subjects: QA75 Electronic computers. Computer science
Online Access: http://eprints.um.edu.my/26993/
Institution: Universiti Malaya
id my.um.eprints.26993
record_format eprints
Record timestamp: 2022-04-05T05:02:15Z

Abstract: This article explores how the content knowledge of historical Malay manuscripts can be represented by extracting event features using an event ontology framework. The manuscript used for testing is Sulalatus Salatin (Sejarah Melayu (SIC)) by Abdul Ahmad Samad, published in the University of Malaya Digital Library database. To align with a domain-specific ontology, the Simple Event Model (SEM) is adopted and an event-based ontology for historical Malay manuscripts is designed. Events are extracted from the manuscript manually and mapped into the Protégé editor. Competency questions were constructed and submitted to the Protégé editor using SPARQL to check the ontology's capability to provide answers and to examine its correctness. The event-based ontology model assists in discovering and representing the content knowledge of historical Malay manuscripts and supports the organisation of knowledge. All the main concepts are extracted from the selected Malay manuscript, and 17 concepts are used to develop the event-based ontology model. The knowledge was verified by three domain experts in Malay manuscripts. In the findings, the interrater reliability for Event and Actor instances is 84%, which means 16% of the instances and their types are incorrect and need amendment. Interrater reliability is 95% for Place and 99% for Role, while the experts achieved 100% agreement for Time. In addition, the experts agreed that the concepts, properties and instances of the Malay Manuscript Ontology complied with the criteria of consistency, completeness, conciseness, expandability and ease of use. The development of the event-based model of an ontology-based system with a high level of semantic granularity reflects the various cultural riches and intellectual aspects stored in Malay manuscripts. This will enable systematic research into the knowledge embedded in the manuscripts and make it widely and easily accessible to everyone.

Citation: Zahila, M. N. and Noorhidawati, Abdullah and Mohd Khalid, Yanti Idaya Aspura (2021) Content extraction of historical Malay manuscripts based on event ontology framework. Applied Ontology, 16 (3). pp. 249-275. ISSN 1570-5838. DOI: https://doi.org/10.3233/AO-210247. Peer reviewed. Published by IOS Press.
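
Illustration (not part of the record): the abstract mentions competency questions posed in SPARQL against an ontology aligned with the Simple Event Model. The sketch below is one way to picture that idea in plain Python, without Protégé or a SPARQL engine. The property names sem:hasActor, sem:hasPlace and sem:hasTime come from the SEM vocabulary. Every ex:* instance, the helper functions and the question itself are hypothetical placeholders and are not taken from the paper's ontology or from the manuscript.

```python
# Minimal sketch, assuming a handful of SEM-style triples kept in memory.
# All ex:* names are hypothetical placeholders, not data from the paper.

TRIPLES = {
    ("ex:event1", "rdf:type", "sem:Event"),
    ("ex:event1", "sem:hasActor", "ex:actor1"),
    ("ex:event1", "sem:hasPlace", "ex:place1"),
    ("ex:event1", "sem:hasTime", "ex:time1"),
    ("ex:event2", "rdf:type", "sem:Event"),
    ("ex:event2", "sem:hasActor", "ex:actor1"),
    ("ex:event2", "sem:hasPlace", "ex:place2"),
    ("ex:event3", "rdf:type", "sem:Event"),
    ("ex:event3", "sem:hasActor", "ex:actor2"),
    ("ex:event3", "sem:hasPlace", "ex:place3"),
}


def match(pattern, triples=TRIPLES):
    """Yield every triple that fits the pattern; None acts as a wildcard."""
    for triple in triples:
        if all(want is None or want == got for want, got in zip(pattern, triple)):
            yield triple


def places_of_events_with_actor(actor, triples=TRIPLES):
    """Hypothetical competency question: where did events involving `actor` occur?

    Roughly what this SPARQL query would ask of a SEM-aligned graph:
        SELECT ?place WHERE { ?e sem:hasActor ex:actor1 ; sem:hasPlace ?place . }
    """
    events = {s for s, _, _ in match((None, "sem:hasActor", actor), triples)}
    places = set()
    for event in events:
        places |= {o for _, _, o in match((event, "sem:hasPlace", None), triples)}
    return sorted(places)


if __name__ == "__main__":
    print(places_of_events_with_actor("ex:actor1"))  # ['ex:place1', 'ex:place2']
```

A real check of the ontology would run the SPARQL query itself against the populated model; the pattern matcher here only mimics a two-triple join to show what such a question returns.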
institution Universiti Malaya
building UM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Malaya
content_source UM Research Repository
url_provider http://eprints.um.edu.my/
topic QA75 Electronic computers. Computer science