Knowledge based semantic representation for semantic relatedness measurements
Textual analysis has become one of the most important tasks due to the rapid increase in the number of texts. The text has been continuously generated in a variety of formats, including social media postings and chats, emails, articles, and news. The handling of these texts necessitates efficient an...
Main Author: | Ali Muttaleb, Hasan |
---|---|
Format: | Thesis |
Language: | English |
Published: | 2022 |
Subjects: | QA75 Electronic computers. Computer science; T Technology (General) |
Online Access: | http://umpir.ump.edu.my/id/eprint/37679/1/ir.Knowledge%20based%20semantic%20representation%20for%20semantic%20relatedness%20measurements.pdf http://umpir.ump.edu.my/id/eprint/37679/ |
Institution: | Universiti Malaysia Pahang |
id |
my.ump.umpir.37679 |
---|---|
record_format |
eprints |
spelling |
my.ump.umpir.37679 2023-09-18T07:52:08Z http://umpir.ump.edu.my/id/eprint/37679/ Knowledge based semantic representation for semantic relatedness measurements Ali Muttaleb, Hasan QA75 Electronic computers. Computer science T Technology (General) 2022-10 Thesis NonPeerReviewed pdf en http://umpir.ump.edu.my/id/eprint/37679/1/ir.Knowledge%20based%20semantic%20representation%20for%20semantic%20relatedness%20measurements.pdf Ali Muttaleb, Hasan (2022) Knowledge based semantic representation for semantic relatedness measurements. PhD thesis, Universiti Malaysia Pahang (Contributors, Thesis advisor: Taha Hussein, Alaaldeen Rassem). |
institution |
Universiti Malaysia Pahang |
building |
UMP Library |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Malaysia Pahang |
content_source |
UMP Institutional Repository |
url_provider |
http://umpir.ump.edu.my/ |
language |
English |
topic |
QA75 Electronic computers. Computer science T Technology (General) |
spellingShingle |
QA75 Electronic computers. Computer science T Technology (General) Ali Muttaleb, Hasan Knowledge based semantic representation for semantic relatedness measurements |
description |
Textual analysis has become one of the most important tasks because of the rapid growth in the volume of text. Text is continuously generated in a variety of formats, including social media posts and chats, emails, articles, and news. Handling these texts requires efficient and effective procedures capable of dealing with the linguistic challenges that arise from the complexity of natural language. In recent years, much research has investigated the use of semantic features from lexical sources to address synonymy and ambiguity in text mining tasks such as document clustering and classification. The main challenges in exploiting lexical knowledge sources such as WordNet are how to incorporate the different types of semantic relations to capture more semantic evidence, and how to handle the high dimensionality of current semantic representation approaches. This research proposes a new knowledge-based semantic representation approach for semantic relatedness measurement. A weighting-based method that incorporates the semantic relations in the lexical sources is proposed to form the representation vector of a word. This method depends on topological parameters (depth, density, descendants, and ancestors) of the semantic taxonomy. To handle the high dimensionality of the weighting-based method, a new topic-based technique is introduced that represents the semantics of words in terms of topics instead of concepts. The proposed approach relies on the semantic features in lexical sources (such as WordNet) to handle synonymy and ambiguity. It is evaluated on semantic relatedness measurement using six gold-standard test sets. The evaluation results, in terms of correlation measures, demonstrate that the weighting-based method is more effective than state-of-the-art feature-based methods. 
To summarise r and p in a single score that is less sensitive to the more anomalous of the two values, a mean-based harmonic measure is calculated for each dataset. On MC30, RG65, WordRel353, MT287, MEN3000, and Rgnew65, the proposed approach achieves r values of 0.82, 0.86, 0.52, 0.53, 0.89, and 0.47 and p values of 0.80, 0.82, 0.52, 0.47, 0.82, and 0.45, respectively. The corresponding mean-based results are reported as 0.81, 0.84, 0.46, 0.49, 0.52, and 0.86 for MC30, RG65, WordRel353, MT287, MEN3000, and Rgnew65, respectively. The non-zero (NZ) measure is also used to determine the percentage of word pairs with a semantic relatedness value greater than zero. On MC30, RG65, WordRel353, MT287, MEN3000, and Rgnew65, the NZ values attained in the experiments were 0.96, 0.95, 0.95, 0.87, 0.95, and 0.95, respectively. |
format |
Thesis |
author |
Ali Muttaleb, Hasan |
author_facet |
Ali Muttaleb, Hasan |
author_sort |
Ali Muttaleb, Hasan |
title |
Knowledge based semantic representation for semantic relatedness measurements |
title_short |
Knowledge based semantic representation for semantic relatedness measurements |
title_full |
Knowledge based semantic representation for semantic relatedness measurements |
title_fullStr |
Knowledge based semantic representation for semantic relatedness measurements |
title_full_unstemmed |
Knowledge based semantic representation for semantic relatedness measurements |
title_sort |
knowledge based semantic representation for semantic relatedness measurements |
publishDate |
2022 |
url |
http://umpir.ump.edu.my/id/eprint/37679/1/ir.Knowledge%20based%20semantic%20representation%20for%20semantic%20relatedness%20measurements.pdf http://umpir.ump.edu.my/id/eprint/37679/ |
_version_ |
1778161082093797376 |
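
The evaluation measures named in the description (correlation, a harmonic mean of two correlation values, and the non-zero ratio) can be illustrated with a minimal sketch. This is not the thesis's implementation. It assumes that r denotes the Pearson correlation and p the Spearman rank correlation, which the record does not state, and all function names and the toy scores are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


def harmonic_mean(a, b):
    """Harmonic mean of two values (assumed here to combine r and p)."""
    return 2 * a * b / (a + b) if a + b else 0.0


def non_zero_ratio(scores):
    """Fraction of word pairs whose relatedness score is greater than zero."""
    return sum(1 for s in scores if s > 0) / len(scores)


# Hypothetical toy data: human ratings and model scores for five word pairs.
human = [3.9, 3.1, 0.4, 1.2, 2.5]
model = [0.8, 0.7, 0.0, 0.3, 0.4]

r = pearson(human, model)
p = spearman(human, model)
print(r, p, harmonic_mean(r, p), non_zero_ratio(model))
```

Under this reading, `harmonic_mean(0.82, 0.80)` rounds to 0.81, which agrees with the combined value the description reports for MC30.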