Improving Accuracy of Imbalanced Clinical Data Classification Using Synthetic Minority Over-Sampling Technique

Imbalanced datasets occur in many real-world applications. Resampling is an effective solution because it produces a balanced class distribution. In this study, the Synthetic Minority Over-sampling Technique (SMOTE), an over-sampling technique, is used to handle the imbalanced dataset by add...

Bibliographic Details
Main Authors: Mumtazimah, Mohamad, Mohd, F, Abdul Jalil, M, Noora, N.M.M, Ismail, S, Yahya, W.F.F
Format: Conference or Workshop Item
Language: English
Published: 2019
Subjects:
Online Access: http://eprints.unisza.edu.my/1819/1/FH03-FIK-20-39681.pdf
http://eprints.unisza.edu.my/1819/
Institution: Universiti Sultan Zainal Abidin
id my-unisza-ir.1819
record_format eprints
spelling my-unisza-ir.1819
2020-11-23T04:13:42Z
http://eprints.unisza.edu.my/1819/
Improving Accuracy of Imbalanced Clinical Data Classification Using Synthetic Minority Over-Sampling Technique
Mumtazimah, Mohamad
Mohd, F
Abdul Jalil, M
Noora, N.M.M
Ismail, S
Yahya, W.F.F
QA76 Computer software
TA Engineering (General). Civil engineering (General)
ZA4050 Electronic information resources
Imbalanced datasets occur in many real-world applications. Resampling is an effective solution because it produces a balanced class distribution. In this study, the Synthetic Minority Over-sampling Technique (SMOTE), an over-sampling technique, is used to handle the imbalanced dataset by increasing the number of instances of the minority class. The technique decreases the imbalance percentage of the dataset by generating new synthetic samples. Thus, a balanced training dataset is produced to replace the class-imbalanced one. The balanced datasets were then used to train machine learning algorithms to diagnose the disease class. In experiments on real-world datasets, an oral cancer dataset and the erythemato-squamous diseases dataset from the UCI machine learning datasets, the over-sampling method showed better results in clinical disease classification.
2019
Conference or Workshop Item
NonPeerReviewed
text
en
http://eprints.unisza.edu.my/1819/1/FH03-FIK-20-39681.pdf
Mumtazimah, Mohamad and Mohd, F and Abdul Jalil, M and Noora, N.M.M and Ismail, S and Yahya, W.F.F (2019) Improving Accuracy of Imbalanced Clinical Data Classification Using Synthetic Minority Over-Sampling Technique. In: 1st International Conference on Intelligent Cloud Computing, ICC 2019, 10-12 December 2019, Riyadh, Saudi Arabia.
institution Universiti Sultan Zainal Abidin
building UNISZA Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Sultan Zainal Abidin
content_source UNISZA Institutional Repository
url_provider https://eprints.unisza.edu.my/
language English
topic QA76 Computer software
TA Engineering (General). Civil engineering (General)
ZA4050 Electronic information resources
description Imbalanced datasets occur in many real-world applications. Resampling is an effective solution because it produces a balanced class distribution. In this study, the Synthetic Minority Over-sampling Technique (SMOTE), an over-sampling technique, is used to handle the imbalanced dataset by increasing the number of instances of the minority class. The technique decreases the imbalance percentage of the dataset by generating new synthetic samples. Thus, a balanced training dataset is produced to replace the class-imbalanced one. The balanced datasets were then used to train machine learning algorithms to diagnose the disease class. In experiments on real-world datasets, an oral cancer dataset and the erythemato-squamous diseases dataset from the UCI machine learning datasets, the over-sampling method showed better results in clinical disease classification.
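The over-sampling step summarized in the description can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `smote`, its parameters, and the toy data are assumptions made up for illustration, and the record does not state which SMOTE settings or software the study used. The sketch only shows the core idea of SMOTE, which is to create each synthetic minority sample by interpolating between a minority instance and one of its nearest minority neighbours.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic rows interpolated between minority neighbours.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority rows by Euclidean distance to the base row.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(p, base))
        neighbour = rng.choice(others[:k])
        # Pick a random position on the segment from base to neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy data (invented): 6 majority rows and 3 minority rows, 2 features each.
majority = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2], [0.0, 0.4], [0.4, 0.1]]
minority = [[2.0, 2.0], [2.2, 2.4], [2.5, 2.1]]

# Generate just enough synthetic rows to match the majority class size.
new_rows = smote(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + new_rows
print(len(majority), len(balanced_minority))
```

In practice such over-sampling is normally applied to the training split only, so that evaluation data contains no synthetic rows; this matches the description's reference to a balanced training dataset.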
format Conference or Workshop Item
author Mumtazimah, Mohamad
Mohd, F
Abdul Jalil, M
Noora, N.M.M
Ismail, S
Yahya, W.F.F
author_sort Mumtazimah, Mohamad
title Improving Accuracy of Imbalanced Clinical Data Classification Using Synthetic Minority Over-Sampling Technique
publishDate 2019
url http://eprints.unisza.edu.my/1819/1/FH03-FIK-20-39681.pdf
http://eprints.unisza.edu.my/1819/