Arabic web page clustering: a review

Clustering is the method employed to group Web pages containing related information into clusters, which facilitates the allocation of relevant information. Clustering performance is mostly dependent on the text features' characteristics. The Arabic language has a complex morphology and is high...

Bibliographic Details
Main Authors: Alghamdi, H. M., Selamat, A.
Format: Article
Language: English
Published: King Saud bin Abdulaziz University 2017
Subjects: QA75 Electronic computers. Computer science
Online Access: http://eprints.utm.my/id/eprint/77231/1/ASelamat2017_ArabicWebpageclusteringareview.pdf
http://eprints.utm.my/id/eprint/77231/
https://www.scopus.com/inward/record.uri?eid=2-s2.0-85021224854&doi=10.1016%2fj.jksuci.2017.06.002&partnerID=40&md5=8ef41df32ef52c5a8531f4017d77b52d
Institution: Universiti Teknologi Malaysia
id my.utm.77231
record_format eprints
spelling my.utm.77231 2018-05-31T09:53:26Z http://eprints.utm.my/id/eprint/77231/ Alghamdi, H. M. and Selamat, A. (2017) Arabic web page clustering: a review. Journal of King Saud University - Computer and Information Sciences. ISSN 1319-1578 (In Press). Article, PeerReviewed, application/pdf, en. DOI: 10.1016/j.jksuci.2017.06.002
institution Universiti Teknologi Malaysia
building UTM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Malaysia
content_source UTM Institutional Repository
url_provider http://eprints.utm.my/
language English
topic QA75 Electronic computers. Computer science
description Clustering is the method employed to group Web pages containing related information into clusters, which facilitates the allocation of relevant information. Clustering performance depends mostly on the characteristics of the text features. Arabic has a complex morphology and is highly inflected, so selecting appropriate features improves clustering performance. Many studies have addressed the clustering problem in Web pages with Arabic content. There are three main challenges in applying text clustering to Arabic Web page content. The first is the difficulty of identifying significant term features that represent the original content while taking hidden knowledge into account. The second is reducing data dimensionality without losing essential information. The third is how to design a model suited to clustering Arabic text that is capable of improving clustering performance. This paper presents an overview of existing Arabic Web page clustering methods, with the goals of clarifying existing problems and examining feature selection and reduction techniques for solving clustering difficulties. In line with the objectives and scope of this study, the present research is a joint effort to improve feature selection and vectorization frameworks in order to enhance current text analysis techniques that can be applied to Arabic Web pages.
format Article
author Alghamdi, H. M.
Selamat, A.
title Arabic web page clustering: a review
publisher King Saud bin Abdulaziz University
publishDate 2017
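
The description in this record names three steps that a clustering pipeline for Arabic Web pages has to handle: choosing term features, reducing dimensionality, and clustering. The sketch below illustrates those steps in plain Python and is not the method of the reviewed paper or of any study it covers. All function names, the sample documents, the light normalization (used here in place of real Arabic stemming), the document-frequency cutoff `min_df`, and the spherical k-means variant are assumptions chosen for illustration.

```python
import math
import re
from collections import Counter

# Arabic diacritics (tashkeel) and tatweel; stripping them is an assumed light normalization.
_DIACRITICS = re.compile(r"[\u064B-\u0652\u0640]")


def tokenize(text):
    """Strip diacritics, unify alef variants, and split into runs of Arabic letters."""
    text = _DIACRITICS.sub("", text)
    text = re.sub("[\u0622\u0623\u0625]", "\u0627", text)
    return re.findall(r"[\u0621-\u064A]+", text)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def unit(v):
    """Scale a vector to unit length; zero vectors are returned unchanged."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v] if norm else list(v)


def tfidf_vectors(docs, min_df=1):
    """Return (vocabulary, unit-length TF-IDF vectors).

    Terms that occur in fewer than min_df documents are dropped, which is a
    very simple form of feature selection and dimensionality reduction.
    """
    tokenized = [tokenize(d) for d in docs]
    df = Counter(t for toks in tokenized for t in set(toks))
    vocab = sorted(t for t, n in df.items() if n >= min_df)
    idf = {t: math.log(len(docs) / df[t]) + 1.0 for t in vocab}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append(unit([tf[t] * idf[t] for t in vocab]))
    return vocab, vectors


def kmeans(vectors, k, iterations=20):
    """Spherical k-means (cosine similarity) with deterministic farthest-first seeding."""
    centroids = [vectors[0]]
    while len(centroids) < k:
        # Pick the document least similar to every centroid chosen so far.
        centroids.append(min(vectors, key=lambda v: max(dot(v, c) for c in centroids)))
    labels = None
    for _ in range(iterations):
        new_labels = [max(range(k), key=lambda j: dot(v, centroids[j])) for v in vectors]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = unit([sum(col) for col in zip(*members)])
    return labels


# Hypothetical sample documents: two about football, two about the economy.
docs = [
    "كرة القدم مباراة الفريق",
    "الفريق فاز في مباراة كرة القدم",
    "الاقتصاد سوق المال البنك",
    "البنك يدعم سوق المال والاقتصاد",
]
vocab, vectors = tfidf_vectors(docs, min_df=2)
labels = kmeans(vectors, k=2)
print(len(vocab), labels)
```

In the sample data, "الاقتصاد" and "والاقتصاد" are counted as different terms because the conjunction prefix is not stripped, so both fall below the `min_df` cutoff. This is a small instance of the morphology problem the description refers to, and it is one reason stemming or root extraction is usually considered before feature selection.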