Improving Accuracy of Imbalanced Clinical Data Classification Using Synthetic Minority Over-Sampling Technique
Imbalanced datasets occur in many real-world applications. Resampling is an effective remedy because it produces a balanced class distribution. This study uses the Synthetic Minority Over-sampling Technique (SMOTE), an over-sampling technique, to deal with imbalanced datasets by add...
Main Authors: | Mumtazimah, Mohamad; Mohd, F; Abdul Jalil, M; Noora, N.M.M; Ismail, S; Yahya, W.F.F |
---|---|
Format: | Conference or Workshop Item |
Language: | English |
Published: | 2019 |
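Subjects: | QA76 Computer software; TA Engineering (General). Civil engineering (General); ZA4050 Electronic information resources |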
Online Access: | http://eprints.unisza.edu.my/1819/1/FH03-FIK-20-39681.pdf http://eprints.unisza.edu.my/1819/ |
Institution: | Universiti Sultan Zainal Abidin |
id |
my-unisza-ir.1819 |
---|---|
record_format |
eprints |
spelling |
my-unisza-ir.1819; 2020-11-23T04:13:42Z; http://eprints.unisza.edu.my/1819/; 2019; Conference or Workshop Item; NonPeerReviewed; text; en; http://eprints.unisza.edu.my/1819/1/FH03-FIK-20-39681.pdf; Mumtazimah, Mohamad and Mohd, F and Abdul Jalil, M and Noora, N.M.M and Ismail, S and Yahya, W.F.F (2019) Improving Accuracy of Imbalanced Clinical Data Classification Using Synthetic Minority Over-Sampling Technique. In: 1st International Conference on Intelligent Cloud Computing, ICC 2019, 10-12 December 2019, Riyadh, Saudi Arabia. |
institution |
Universiti Sultan Zainal Abidin |
building |
UNISZA Library |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Sultan Zainal Abidin |
content_source |
UNISZA Institutional Repository |
url_provider |
https://eprints.unisza.edu.my/ |
language |
English |
topic |
QA76 Computer software; TA Engineering (General). Civil engineering (General); ZA4050 Electronic information resources |
description |
Imbalanced datasets occur in many real-world applications. Resampling is an effective remedy because it
produces a balanced class distribution. This study uses the Synthetic Minority Over-sampling Technique (SMOTE),
an over-sampling technique, to deal with imbalanced datasets by increasing the number of instances of the minority
class. The technique reduces the imbalance percentage of the dataset by generating new synthetic
samples, so a balanced training dataset replaces the imbalanced one. The balanced datasets were then
used to train machine learning algorithms to diagnose the disease class. In experiments
on real-world datasets, the oral cancer dataset and the erythemato-squamous diseases dataset from the UCI
machine learning datasets, the over-sampling method showed better results in clinical disease classification. |
format |
Conference or Workshop Item |
author |
Mumtazimah, Mohamad; Mohd, F; Abdul Jalil, M; Noora, N.M.M; Ismail, S; Yahya, W.F.F |
author_sort |
Mumtazimah, Mohamad |
title |
Improving Accuracy of Imbalanced Clinical Data Classification Using Synthetic Minority Over-Sampling Technique |
publishDate |
2019 |
url |
http://eprints.unisza.edu.my/1819/1/FH03-FIK-20-39681.pdf http://eprints.unisza.edu.my/1819/ |
_version_ |
1684657757833134080 |
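
The description in this record says SMOTE balances the training set by generating new synthetic samples for the minority class. The sketch below illustrates that idea in plain Python under stated assumptions: it is a simplified version of the interpolation step, not the authors' implementation, and the function name `smote`, its parameters, and the toy data are all hypothetical. Each synthetic point is placed at a random position on the line segment between a minority sample and one of its k nearest minority neighbours.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority-class neighbours.

    Simplified sketch: base points are drawn at random rather than
    over-sampling every minority point by a fixed percentage.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base point (the base itself is excluded)
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: 8 majority and 3 minority samples with two features each.
majority = [[float(x), float(y)] for x in range(4) for y in range(2)]
minority = [[10.0, 10.0], [11.0, 10.5], [10.5, 11.5]]

# Generate just enough synthetic minority points to match the majority count.
new_points = smote(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

Because every synthetic point is a convex combination of two real minority samples, it stays inside the region the minority class already occupies. In practice the over-sampling would be applied to the training split only, so that no synthetic point leaks into the data used for evaluation.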