Resolusi anafora artikel Bahasa Melayu berasaskan pengetahuan terhad dan kelas semantik

Anaphora resolution (AR) is the process of resolving the entity to which an anaphoric pronoun refers. It is a phenomenon that occurs in every language and requires human experts or specific rules to resolve. AR can improve language processing applications such as question answering, text mining, do...

Bibliographic Details
Main Author:	Noorhuzaimi@Karimah, Mohd Noor
Format:	Thesis
Language:	English
Published:	2016
Subjects:	PL Languages and literatures of Eastern Asia, Africa, Oceania
Online Access:	http://umpir.ump.edu.my/id/eprint/25341/1/Resolusi%20anafora%20artikel%20Bahasa%20Melayu%20berasaskan%20pengetahuan%20terhad.pdf http://umpir.ump.edu.my/id/eprint/25341/
Institution:	Universiti Malaysia Pahang

id	my.ump.umpir.25341
record_format	eprints
spelling	my.ump.umpir.25341 2021-07-28T03:18:07Z http://umpir.ump.edu.my/id/eprint/25341/ Resolusi anafora artikel Bahasa Melayu berasaskan pengetahuan terhad dan kelas semantik Noorhuzaimi@Karimah, Mohd Noor PL Languages and literatures of Eastern Asia, Africa, Oceania 2016 Thesis NonPeerReviewed pdf en http://umpir.ump.edu.my/id/eprint/25341/1/Resolusi%20anafora%20artikel%20Bahasa%20Melayu%20berasaskan%20pengetahuan%20terhad.pdf Noorhuzaimi@Karimah, Mohd Noor (2016) Resolusi anafora artikel Bahasa Melayu berasaskan pengetahuan terhad dan kelas semantik. PhD thesis, Universiti Kebangsaan Malaysia.
institution	Universiti Malaysia Pahang
building	UMP Library
collection	Institutional Repository
continent	Asia
country	Malaysia
content_provider	Universiti Malaysia Pahang
content_source	UMP Institutional Repository
url_provider	http://umpir.ump.edu.my/
language	English
topic	PL Languages and literatures of Eastern Asia, Africa, Oceania
spellingShingle	PL Languages and literatures of Eastern Asia, Africa, Oceania Noorhuzaimi@Karimah, Mohd Noor Resolusi anafora artikel Bahasa Melayu berasaskan pengetahuan terhad dan kelas semantik
description	Anaphora resolution (AR) is the process of resolving the entity to which an anaphoric pronoun refers. It is a phenomenon that occurs in every language and requires human experts or specific rules to resolve. AR can improve language processing applications such as question answering, text mining, document summarization, and information extraction. Various studies have been carried out on AR, but the majority target languages such as English, Japanese and Norwegian; very little research has focused on AR for the Malay language. Therefore, the aim of this research is to resolve anaphora in Malay text using a knowledge-poor approach and a semantic class labelling model. To achieve this aim, a framework for Malay AR was developed as a guide to resolving this phenomenon in the Malay language. The usage type of the pronoun nya is determined using a set of rules, a set of similar words, and word filtering generated from the semantic class labelling model. This process is important because nya is the most frequently used pronoun in Malay text, amounting to 68% of usage, compared with other pronouns, whose use mostly depends on the sociological status of the referring entity or antecedent. Determining the antecedent candidates is another important process. Antecedent candidates can be proper nouns or nouns. To determine which proper nouns are suitable candidates, two main processes are needed: (1) entity recognition for proper nouns that contain the word 'dan' and the comma symbol (,); and (2) determining the semantic label of each retrieved candidate in order to establish its sociological status. The research used part of the name gazetteers for people, organizations, locations and positions. Testing was conducted on 60 Malay articles with different classes of proper nouns. The results were compared with benchmark data tagged by a Malay linguist. The results show average precision and recall values of 85% and 90% respectively. The proposed knowledge-poor AR framework for Malay text improves the success rate by 18.79% compared with the generic approach proposed by Mitkov and Lappin.
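As an illustration only, the two candidate-determination steps named in the description (splitting proper nouns joined by 'dan' or commas, then assigning a semantic label from a gazetteer) might be sketched as follows. This is not the thesis's implementation: the gazetteer entries, the function names and the first-matching-token lookup are all assumptions made for the example.

```python
import re

# Hypothetical mini-gazetteer. The gazetteers used in the thesis are not
# reproduced in this record; these entries are illustrative assumptions.
GAZETTEER = {
    "ahmad": "person",
    "siti": "person",
    "petronas": "organization",
    "kuantan": "location",
    "menteri": "position",
}

# Separator: a comma (optionally followed by 'dan'), or the word 'dan' alone.
SEPARATOR = re.compile(r"\s*,\s*(?:dan\s+)?|\s+dan\s+")


def split_coordinated(phrase):
    """Split a coordinated proper-noun phrase on commas and the word 'dan'."""
    return [part.strip() for part in SEPARATOR.split(phrase) if part.strip()]


def label_candidate(candidate):
    """Return the label of the first token found in the gazetteer, else 'unknown'."""
    for token in candidate.lower().split():
        if token in GAZETTEER:
            return GAZETTEER[token]
    return "unknown"


def antecedent_candidates(phrase):
    """Pair each split candidate with its (assumed) semantic class label."""
    return [(c, label_candidate(c)) for c in split_coordinated(phrase)]
```

For example, antecedent_candidates("Ahmad, Siti dan Petronas") yields three candidates labelled person, person and organization under this toy gazetteer; a name absent from the gazetteer is labelled 'unknown'.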
format	Thesis
author	Noorhuzaimi@Karimah, Mohd Noor
author_facet	Noorhuzaimi@Karimah, Mohd Noor
author_sort	Noorhuzaimi@Karimah, Mohd Noor
title	Resolusi anafora artikel Bahasa Melayu berasaskan pengetahuan terhad dan kelas semantik
title_short	Resolusi anafora artikel Bahasa Melayu berasaskan pengetahuan terhad dan kelas semantik
title_full	Resolusi anafora artikel Bahasa Melayu berasaskan pengetahuan terhad dan kelas semantik
title_fullStr	Resolusi anafora artikel Bahasa Melayu berasaskan pengetahuan terhad dan kelas semantik
title_full_unstemmed	Resolusi anafora artikel Bahasa Melayu berasaskan pengetahuan terhad dan kelas semantik
title_sort	resolusi anafora artikel bahasa melayu berasaskan pengetahuan terhad dan kelas semantik
publishDate	2016
url	http://umpir.ump.edu.my/id/eprint/25341/1/Resolusi%20anafora%20artikel%20Bahasa%20Melayu%20berasaskan%20pengetahuan%20terhad.pdf http://umpir.ump.edu.my/id/eprint/25341/
_version_	1706957243648311296

Similar Items