Rule-based filtering algorithm for textual document

Textual documents are usually unstructured and high-dimensional. Exploring the hidden information in unstructured text is useful for finding interesting patterns and valuable knowledge. However, not all terms in a text are relevant, and irrelevant terms can lead to misclassification. Improper filtr...
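
The abstract of this record describes the approach only at a high level: a rule-based filter that protects terms listed in a keyword library, plus a term weighting step that handles synonyms. The Python sketch below illustrates that general idea and is not the authors' algorithm. The stopword list, the keyword library, the synonym table, the minimum-length rule, and the use of raw term frequency as the weight are all assumptions made up for illustration.

```python
import re
from collections import Counter

# Hypothetical tables, invented for this sketch (not taken from the paper).
STOPWORDS = {"the", "a", "an", "is", "of", "and", "in", "to", "for"}
KEYWORD_LIBRARY = {"ai", "ir"}  # special terms that must survive filtering
SYNONYMS = {"automobile": "car", "vehicle": "car"}  # synonym -> canonical term
MIN_LENGTH = 3  # assumed rule: shorter tokens are treated as noise


def filter_terms(text):
    """Return a Counter mapping each kept term to its frequency weight."""
    weights = Counter()
    for token in re.findall(r"[a-z]+", text.lower()):
        # Rule 1: terms in the keyword library are always kept.
        if token in KEYWORD_LIBRARY:
            weights[token] += 1
            continue
        # Rule 2: drop stopwords and very short tokens.
        if token in STOPWORDS or len(token) < MIN_LENGTH:
            continue
        # Rule 3: fold synonyms into one canonical term so their
        # weights accumulate instead of being split or discarded.
        weights[SYNONYMS.get(token, token)] += 1
    return weights


sample = "The automobile is a vehicle; AI in the car is for IR."
weights = filter_terms(sample)
print(weights.most_common())
```

In this example the short tokens "ai" and "ir" would be dropped by the length rule, but they survive because they are in the keyword library, and "automobile", "vehicle", and "car" are counted together as one term.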

Bibliographic Details
Main Authors: Jamil, Nurul Syafidah; Ku-Mahamud, Ku Ruhana; Mohamed Din, Aniza
Format: Article
Language: English
Published: 2017
Subjects: QA76 Computer software
Online Access: http://repo.uum.edu.my/21718/1/IJSEI%20%206%2061%202017%2044%2048.pdf
http://repo.uum.edu.my/21718/
http://www.ijsei.com/archive-66117.htm
Institution: Universiti Utara Malaysia
id my.uum.repo.21718
record_format eprints
spelling my.uum.repo.21718 2017-04-19T07:42:01Z http://repo.uum.edu.my/21718/ Rule-based filtering algorithm for textual document Jamil, Nurul Syafidah; Ku-Mahamud, Ku Ruhana; Mohamed Din, Aniza QA76 Computer software 2017-02 Article PeerReviewed application/pdf en cc4_by_sa http://repo.uum.edu.my/21718/1/IJSEI%20%206%2061%202017%2044%2048.pdf Jamil, Nurul Syafidah and Ku-Mahamud, Ku Ruhana and Mohamed Din, Aniza (2017) Rule-based filtering algorithm for textual document. International Journal of Science and Engineering Investigations, 6 (61). pp. 44-48. ISSN 2251-8843 http://www.ijsei.com/archive-66117.htm
institution Universiti Utara Malaysia
building UUM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Utara Malaysia
content_source UUM Institutional Repository
url_provider http://repo.uum.edu.my/
language English
topic QA76 Computer software
description Textual documents are usually unstructured and high-dimensional. Exploring the hidden information in unstructured text is useful for finding interesting patterns and valuable knowledge. However, not all terms in a text are relevant, and irrelevant terms can lead to misclassification. Improper filtration might cause terms that have similar meanings to be removed. Thus, to reduce the high dimensionality of text, this study proposes a filtering algorithm that filters the important terms from the pre-processed text and applies a term weighting scheme to address the synonym problem, which helps the selection of relevant terms. The proposed filtering algorithm uses a keyword library of special terms, developed to ensure that important terms are not eliminated during the filtration process. The performance of the proposed filtering algorithm is compared with rough set attribute reduction (RSAR) and information retrieval (IR) approaches. In the experiment, the proposed filtering algorithm outperformed both RSAR and IR in terms of extracted relevant terms.
format Article
author Jamil, Nurul Syafidah
Ku-Mahamud, Ku Ruhana
Mohamed Din, Aniza
title Rule-based filtering algorithm for textual document
publishDate 2017
url http://repo.uum.edu.my/21718/1/IJSEI%20%206%2061%202017%2044%2048.pdf
http://repo.uum.edu.my/21718/
http://www.ijsei.com/archive-66117.htm