Improvement of Malay information retrieval using local stop words
This paper concerns an experiment on a Malay information retrieval system using local stop words lists. We extract potential candidates for the stop words list from a Malay Quranic text collection and use a list of existing stop words as the preliminary list. All words from the Quranic documents are extracted a...
Main Authors: | Abdullah, Muhamad Taufik; Ahmad, Fatimah; Mahmod, Ramlan; Tengku Sembok, Tengku Mohd |
---|---|
Format: | Conference or Workshop Item |
Language: | English |
Published: | 2005 |
Online Access: | http://psasir.upm.edu.my/id/eprint/38975/1/38975.pdf http://psasir.upm.edu.my/id/eprint/38975/ |
Institution: | Universiti Putra Malaysia |
id |
my.upm.eprints.38975 |
---|---|
record_format |
eprints |
spelling |
my.upm.eprints.38975 2015-08-24T02:10:26Z http://psasir.upm.edu.my/id/eprint/38975/ Improvement of Malay information retrieval using local stop words Abdullah, Muhamad Taufik Ahmad, Fatimah Mahmod, Ramlan Tengku Sembok, Tengku Mohd This paper concerns an experiment on a Malay information retrieval system using local stop words lists. We extract potential candidates for the stop words list from a Malay Quranic text collection and use a list of existing stop words as the preliminary list. All words from the Quranic documents are extracted and ranked by frequency of occurrence in decreasing order, and the 50 most frequently occurring words in the corpus are then obtained. The new Malay stop words list is evaluated on the Quranic collection. Applying the new stop words, in which the words from the Quran are compared with the stored stop words lists, leaves 39.7% to 43.6% of the terms remaining. An experiment was conducted to evaluate the performance of these stop words lists in terms of recall and precision for a Malay information retrieval system. Comparing the results without and with the stop words lists showed that the new stop words lists increased the average precision by 24.8% to 35.0%. The results demonstrate that this list can be used successfully in a Malay information retrieval system. 2005 Conference or Workshop Item NonPeerReviewed application/pdf en http://psasir.upm.edu.my/id/eprint/38975/1/38975.pdf Abdullah, Muhamad Taufik and Ahmad, Fatimah and Mahmod, Ramlan and Tengku Sembok, Tengku Mohd (2005) Improvement of Malay information retrieval using local stop words. In: International Advanced Technology Congress: Conference on Computer Integrated Systems, 6-8 Dec. 2005, Putrajaya, Malaysia. |
institution |
Universiti Putra Malaysia |
building |
UPM Library |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Putra Malaysia |
content_source |
UPM Institutional Repository |
url_provider |
http://psasir.upm.edu.my/ |
language |
English |
description |
This paper concerns an experiment on a Malay information retrieval system using local stop words lists. We extract potential candidates for the stop words list from a Malay Quranic text collection and use a list of existing stop words as the preliminary list. All words from the Quranic documents are extracted and ranked by frequency of occurrence in decreasing order, and the 50 most frequently occurring words in the corpus are then obtained. The new Malay stop words list is evaluated on the Quranic collection. Applying the new stop words, in which the words from the Quran are compared with the stored stop words lists, leaves 39.7% to 43.6% of the terms remaining. An experiment was conducted to evaluate the performance of these stop words lists in terms of recall and precision for a Malay information retrieval system. Comparing the results without and with the stop words lists showed that the new stop words lists increased the average precision by 24.8% to 35.0%. The results demonstrate that this list can be used successfully in a Malay information retrieval system. |
format |
Conference or Workshop Item |
author |
Abdullah, Muhamad Taufik Ahmad, Fatimah Mahmod, Ramlan Tengku Sembok, Tengku Mohd |
spellingShingle |
Abdullah, Muhamad Taufik Ahmad, Fatimah Mahmod, Ramlan Tengku Sembok, Tengku Mohd Improvement of Malay information retrieval using local stop words |
author_facet |
Abdullah, Muhamad Taufik Ahmad, Fatimah Mahmod, Ramlan Tengku Sembok, Tengku Mohd |
author_sort |
Abdullah, Muhamad Taufik |
title |
Improvement of Malay information retrieval using local stop words |
title_short |
Improvement of Malay information retrieval using local stop words |
title_full |
Improvement of Malay information retrieval using local stop words |
title_fullStr |
Improvement of Malay information retrieval using local stop words |
title_full_unstemmed |
Improvement of Malay information retrieval using local stop words |
title_sort |
improvement of malay information retrieval using local stop words |
publishDate |
2005 |
url |
http://psasir.upm.edu.my/id/eprint/38975/1/38975.pdf http://psasir.upm.edu.my/id/eprint/38975/ |
_version_ |
1643832288674316288 |
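
The procedure summarized in the description field (rank corpus words by frequency, keep the 50 most frequent, combine them with an existing stop words list, then measure the remaining terms and the retrieval precision) might be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function names, the simple letter-based tokenizer, the toy documents, and the use of non-interpolated average precision are all assumptions, since the record does not state how the paper tokenized the text or computed its averages.

```python
from collections import Counter
import re


def tokenize(text):
    """Lowercase the text and split it into runs of ASCII letters.

    Assumption: the paper's actual tokenization is not described in the record.
    """
    return re.findall(r"[a-z]+", text.lower())


def top_frequent_words(documents, n=50):
    """Return the n most frequent words in the collection, most frequent first."""
    counts = Counter()
    for doc in documents:
        counts.update(tokenize(doc))
    return [word for word, _ in counts.most_common(n)]


def remaining_fraction(documents, stop_words):
    """Return the fraction of tokens that survive stop-word removal."""
    stop = set(stop_words)
    total = 0
    kept = 0
    for doc in documents:
        for token in tokenize(doc):
            total += 1
            if token not in stop:
                kept += 1
    return kept / total if total else 0.0


def average_precision(ranked_ids, relevant_ids):
    """Non-interpolated average precision for a single query.

    Assumption: the paper may have used a different averaging scheme.
    """
    relevant = set(relevant_ids)
    hits = 0
    score = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            score += hits / rank
    return score / len(relevant) if relevant else 0.0


if __name__ == "__main__":
    # Hypothetical toy collection and preliminary list, for illustration only.
    documents = ["dan yang di dan", "yang dan itu kitab"]
    existing_stop_words = ["itu"]

    candidates = top_frequent_words(documents, n=2)
    combined = set(candidates) | set(existing_stop_words)

    print("candidates:", candidates)
    print("remaining fraction:", remaining_fraction(documents, combined))
    print("average precision:", average_precision(["d1", "d2", "d3"], ["d1", "d3"]))
```

With a real collection, `n=50` would match the list size mentioned in the description, and the remaining fraction and average precision would be compared between runs with and without the stop words lists.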