Transformation of Extracted Knowledge in Malay Unstructured Documents Into an Interrogative Structured Form
The availability of knowledge discovery operations helps to extract valuable information and knowledge from large volumes of data in structured databases. However, a large portion of the available information is not in structured form but rather in collections of unstructured text documents,...
Main Author: | Sidi, Fatimah |
---|---|
Format: | Thesis |
Language: | English |
Published: | 2007 |
Subjects: | Knowledge acquisition (Expert systems) Databases |
Online Access: | http://psasir.upm.edu.my/id/eprint/5887/1/FSKTM_2007_10%20IR.pdf http://psasir.upm.edu.my/id/eprint/5887/ |
Institution: | Universiti Putra Malaysia |
id |
my.upm.eprints.5887 |
---|---|
record_format |
eprints |
spelling |
my.upm.eprints.5887 2022-01-20T07:29:51Z http://psasir.upm.edu.my/id/eprint/5887/ Sidi, Fatimah 2007-09 Thesis NonPeerReviewed text en http://psasir.upm.edu.my/id/eprint/5887/1/FSKTM_2007_10%20IR.pdf Sidi, Fatimah (2007) Transformation of Extracted Knowledge in Malay Unstructured Documents Into an Interrogative Structured Form. Doctoral thesis, Universiti Putra Malaysia. Knowledge acquisition (Expert systems) Databases |
institution |
Universiti Putra Malaysia |
building |
UPM Library |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Putra Malaysia |
content_source |
UPM Institutional Repository |
url_provider |
http://psasir.upm.edu.my/ |
language |
English |
topic |
Knowledge acquisition (Expert systems) Databases |
description |
The availability of knowledge discovery operations helps to extract valuable
information and knowledge from large volumes of data in structured databases.
However, a large portion of the available information is not in structured form
but rather in collections of unstructured text documents, and this also
applies to Malay unstructured documents. Therefore, structuring
characteristics must be imposed on unstructured documents in order to
transform the information they contain into knowledge. A new approach has
been established to transform knowledge extracted from Malay unstructured
documents by identifying, organizing, and structuring it into an
interrogative structured form. Its architecture is based on the
implementation of (i) interrogative knowledge identification; (ii) interrogative
contextual information; and (iii) interrogative knowledge organization and
structuring with Malay knowledge representation by concepts. It utilizes the
Malay language corpus; interrogative theory; and the object-oriented,
ontology, and database models. The research involves system development
based on the architecture of the MalayIK-Ontology, which is evaluated by
quantitative retrieval performance using the recall and precision metrics. The
Retrieval Interrogative Ontology Analysis Application was developed to
verify the fitness for task of its functionalities and the usefulness of
interrogative contextual information with a color-coding
supplement, additional information annotation, and Malay knowledge
representation by concepts. A number of experiments were carried out to
quantify the accuracy of the knowledge extracted. The MalayIK-Ontology was
tested using stratified random sampling drawn from various sources of
Malay unstructured documents such as news, e-mails, articles, magazines,
and texts from children's storybooks. The results of the experiments
show that the MalayIK-Ontology approach performed well compared
with knowledge extracted manually by an expert. The results of the
questionnaire evaluation of the Retrieval Interrogative Ontology Analysis
Application show good achievement in understanding the main point
of an unstructured document easily and clearly. The approach is intended to
improve understanding of the process of making sense of information as
knowledge, maintaining the meaning of the information, and arriving at the
same interpretation of the knowledge in an unstructured document, so that
different people perceive identical knowledge. |
format |
Thesis |
author |
Sidi, Fatimah |
title |
Transformation of Extracted Knowledge in Malay Unstructured Documents Into an Interrogative Structured Form |
publishDate |
2007 |
url |
http://psasir.upm.edu.my/id/eprint/5887/1/FSKTM_2007_10%20IR.pdf http://psasir.upm.edu.my/id/eprint/5887/ |
_version_ |
1724075265509818368 |
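
The description above states that extraction quality is measured with recall and precision against knowledge extracted manually by an expert. As a minimal sketch of how those two metrics are usually computed, assuming each extracted item can be represented as a (question word, answer) pair: the function name `precision_recall` and the sample pairs below are hypothetical illustrations, not data or code from the thesis.

```python
def precision_recall(extracted, expert):
    """Return (precision, recall) of system-extracted items against an expert's items.

    Both arguments are iterables of hashable items; duplicates are ignored.
    """
    extracted_set = set(extracted)
    expert_set = set(expert)
    # Items the system extracted that the expert also extracted.
    true_positives = len(extracted_set & expert_set)
    # Precision: share of extracted items that are correct.
    precision = true_positives / len(extracted_set) if extracted_set else 0.0
    # Recall: share of the expert's items that the system found.
    recall = true_positives / len(expert_set) if expert_set else 0.0
    return precision, recall


# Hypothetical sample: (Malay question word, answer) pairs, invented for illustration.
system_output = [("siapa", "Ali"), ("di mana", "Kuala Lumpur"), ("bila", "semalam")]
expert_output = [
    ("siapa", "Ali"),
    ("di mana", "Kuala Lumpur"),
    ("apa", "buku"),
    ("mengapa", "hujan"),
]

precision, recall = precision_recall(system_output, expert_output)
print(f"precision={precision:.2f} recall={recall:.2f}")  # precision=0.67 recall=0.50
```

Treating the expert's extraction as the reference set means precision penalizes spurious extractions while recall penalizes missed ones; the thesis may define matching between system and expert items differently.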