An improved associative classification model using fuzzy parameterized soft set-based decision for text classification
Main Author: | |
---|---|
Format: | Thesis |
Language: | English |
Published: | 2023 |
Subjects: | |
Online Access: | http://eprints.uthm.edu.my/10825/1/24p%20DEDE%20ROHIDIN.pdf http://eprints.uthm.edu.my/10825/2/DEDE%20ROHIDIN%20COPYRIGHT%20DECLARATION.pdf http://eprints.uthm.edu.my/10825/3/DEDE%20ROHIDIN%20WATERMARK.pdf http://eprints.uthm.edu.my/10825/ |
Institution: | Universiti Tun Hussein Onn Malaysia |
Summary: | Text classification is applicable in various problem domains, including marketing,
security, and biomedicine. One of the potential text classifiers is the well-known
associative classification approach. However, the existing associative classification
approach is still prone to some limitations, especially the generation of too many
rules in text classification problems. Some of the rules generated from
the textual data may be irrelevant or redundant, resulting in low performance on
imbalanced and class-overlapping data. Therefore, this research has proposed an
improved associative classification approach to enhance the performance and
efficiency of the text classification by removing the irrelevant rules, reducing
redundant rules, and handling the imbalanced and class overlapping issues in the
textual data. The proposed associative classification approach consists of three stages:
pre-processing, fuzzification and classification. Primarily in the classification stage,
this study proposes integrating principles of fuzzy soft set theory into associative
rules; the resulting method is referred to as Class-Based Fuzzy Soft Associative (CBFSA).
The experiments used the 20 Newsgroups (balanced) and Reuters-21578
(imbalanced) datasets to evaluate the proposed model. The results show that CBFSA is successful in
removing irrelevant rules and reducing redundant rules. The CBFSA classifier applies
a smaller number of rules than Classification Based on Associations (CBA) and
Classification based on Predictive Association Rules (CPAR). CBFSA is also successful in dealing with
imbalanced and class-overlapping data. CBFSA achieves higher performance and runs faster than
CBA and CPAR. Meanwhile, in a comparative analysis against several non-associative
classifiers, CBFSA can improve the F1-measure by between 6% and 32%. The
processing time of CBFSA is shorter than that of RNN and CNN but slightly longer than that of
Decision Tree, k-NN, Naïve Bayes, Rocchio, Bagging and Boosting. |
---|
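The summary refers to irrelevant and redundant rules without showing what they look like. The sketch below is a generic illustration of class association rules, not the CBFSA method itself: it mines term-to-class rules from a toy corpus using support and confidence thresholds, then drops any rule that a more general rule for the same class already covers with equal or higher confidence. The corpus, the thresholds, and the function names `mine_rules` and `prune_redundant` are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

# Toy corpus of (terms, class label) pairs. Hypothetical data, for illustration only.
DOCS = [
    ({"goal", "match", "team"}, "sport"),
    ({"goal", "team"}, "sport"),
    ({"match", "team", "score"}, "sport"),
    ({"stock", "market", "goal"}, "finance"),
    ({"stock", "market"}, "finance"),
    ({"market", "score"}, "finance"),
]


def mine_rules(docs, min_support=0.3, min_confidence=0.8, max_len=2):
    """Return rules as {(antecedent_terms, label): (support, confidence)}."""
    n = len(docs)
    antecedent_counts = Counter()  # how many documents contain the term set
    rule_counts = Counter()        # how many of those carry the given label
    for terms, label in docs:
        for size in range(1, max_len + 1):
            for combo in combinations(sorted(terms), size):
                antecedent_counts[combo] += 1
                rule_counts[(combo, label)] += 1
    rules = {}
    for (combo, label), count in rule_counts.items():
        support = count / n
        confidence = count / antecedent_counts[combo]
        # Rules below either threshold are treated as irrelevant and discarded.
        if support >= min_support and confidence >= min_confidence:
            rules[(combo, label)] = (support, confidence)
    return rules


def prune_redundant(rules):
    """Drop a rule when a strictly more general rule for the same class
    has confidence at least as high."""
    kept = {}
    for (combo, label), (support, confidence) in rules.items():
        redundant = any(
            other_label == label
            and set(other) < set(combo)
            and other_confidence >= confidence
            for (other, other_label), (_, other_confidence) in rules.items()
        )
        if not redundant:
            kept[(combo, label)] = (support, confidence)
    return kept


rules = mine_rules(DOCS)
pruned = prune_redundant(rules)
for (terms, label), (support, confidence) in sorted(pruned.items()):
    print(f"{terms} -> {label}  support={support:.2f} confidence={confidence:.2f}")
```

On this toy corpus the two-term rules add nothing beyond their single-term counterparts, so pruning keeps only the single-term rules. A real text collection would produce far more candidate rules, which is the setting the summary describes.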