Improvement of Malay information retrieval using local stop words

This paper reports an experiment on a Malay information retrieval system using local stop word lists. We extract potential stop word candidates from a Malay Quranic text collection and from a list of existing stop words as the preliminary list. All words from the Quranic documents are extracted a...

Bibliographic Details
Main Authors: Abdullah, Muhamad Taufik, Ahmad, Fatimah, Mahmod, Ramlan, Tengku Sembok, Tengku Mohd
Format: Conference or Workshop Item
Language: English
Published: 2005
Online Access: http://psasir.upm.edu.my/id/eprint/38975/1/38975.pdf
http://psasir.upm.edu.my/id/eprint/38975/
Institution: Universiti Putra Malaysia
id my.upm.eprints.38975
record_format eprints
spelling my.upm.eprints.38975 2015-08-24T02:10:26Z http://psasir.upm.edu.my/id/eprint/38975/ Improvement of Malay information retrieval using local stop words Abdullah, Muhamad Taufik Ahmad, Fatimah Mahmod, Ramlan Tengku Sembok, Tengku Mohd 2005 Conference or Workshop Item NonPeerReviewed application/pdf en http://psasir.upm.edu.my/id/eprint/38975/1/38975.pdf Abdullah, Muhamad Taufik and Ahmad, Fatimah and Mahmod, Ramlan and Tengku Sembok, Tengku Mohd (2005) Improvement of Malay information retrieval using local stop words. In: International Advanced Technology Congress: Conference on Computer Integrated Systems, 6-8 Dec. 2005, Putrajaya, Malaysia.
institution Universiti Putra Malaysia
building UPM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Putra Malaysia
content_source UPM Institutional Repository
url_provider http://psasir.upm.edu.my/
language English
description This paper reports an experiment on a Malay information retrieval system using local stop word lists. We extract potential stop word candidates from a Malay Quranic text collection and from a list of existing stop words as the preliminary list. All words from the Quranic documents are extracted and ranked by frequency of occurrence in decreasing order, and the 50 most frequently occurring words in the corpus are then obtained. The new Malay stop word list is evaluated on the Quranic collection. Applying the new stop words, in which the words from the Quran are compared with the stored stop word lists, leaves 39.7% to 43.6% of the terms remaining. An experiment was carried out to evaluate the performance of the stop word lists in terms of recall and precision for a Malay information retrieval system. Comparing the results obtained without a stop word list and with the stop word lists showed that the new stop word lists increased the average precision by 24.8% to 35.0%. The results demonstrate that the list can be used successfully in a Malay information retrieval system.
format Conference or Workshop Item
author Abdullah, Muhamad Taufik
Ahmad, Fatimah
Mahmod, Ramlan
Tengku Sembok, Tengku Mohd
title Improvement of Malay information retrieval using local stop words
publishDate 2005
url http://psasir.upm.edu.my/id/eprint/38975/1/38975.pdf
http://psasir.upm.edu.my/id/eprint/38975/
_version_ 1643832288674316288
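
The procedure summarised in the description (rank corpus words by frequency, take the most frequent ones as stop word candidates, measure how many terms remain, then compare precision) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the tokenizer, and the toy sentences are hypothetical. Three points are assumptions, because the record does not settle them: the preliminary list is merged with the frequent words by set union, "remaining terms" is counted over term occurrences rather than distinct terms, and "average precision" is taken to be the non-interpolated form.

```python
from collections import Counter
import re


def tokenize(text):
    """Lowercase the text and split it into runs of letters (hypothetical tokenizer)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def build_stop_list(documents, preliminary, top_n=50):
    """Merge a preliminary stop list with the top_n most frequent corpus words.

    Assumption: the two sources are combined by set union.
    """
    counts = Counter(tok for doc in documents for tok in tokenize(doc))
    frequent = [word for word, _ in counts.most_common(top_n)]
    return set(preliminary) | set(frequent)


def remaining_share(documents, stop_words):
    """Fraction of term occurrences left after stop word removal.

    Assumption: occurrences are counted, not distinct terms.
    """
    tokens = [tok for doc in documents for tok in tokenize(doc)]
    if not tokens:
        return 0.0
    kept = [tok for tok in tokens if tok not in stop_words]
    return len(kept) / len(tokens)


def average_precision(ranked_ids, relevant_ids):
    """Non-interpolated average precision of one ranked result list."""
    hits, total = 0, 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            total += hits / rank  # precision at this relevant document
    return total / len(relevant_ids) if relevant_ids else 0.0


# Toy data for illustration only, not taken from the Quranic collection.
docs = ["dan yang beriman dan beramal", "yang baik dan yang sabar"]
stop_words = build_stop_list(docs, preliminary={"itu"}, top_n=2)
share = remaining_share(docs, stop_words)
ap = average_precision(["d1", "d2", "d3", "d4"], {"d1", "d3"})
```

With the toy data, the two most frequent words ("dan" and "yang") join the preliminary list, and four of the ten term occurrences remain. In an experiment like the one described, average precision would be computed for each query with and without stop word removal and the two averages compared.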