Fuzzy distance-based undersampling technique for imbalanced flood data
Performances of classifiers are affected by imbalanced data because instances in the minority class are often ignored. Imbalanced data often occur in many application domains including flood. If flood cases are misclassified, the impact of flood is higher than the misclassification of non-flood cas...
Main Authors: | Ku-Mahamud, Ku Ruhana; Zorkeflee, Maisarah; Mohamed Din, Aniza |
---|---|
Format: | Conference or Workshop Item |
Language: | English |
Published: | 2016 |
Subjects: | QA75 Electronic computers. Computer science |
Online Access: | http://repo.uum.edu.my/20158/1/KMICe2016%20509%20513.pdf http://repo.uum.edu.my/20158/ http://www.kmice.cms.net.my/kmice2016/files/KMICe2016_eproceeding.pdf |
Institution: | Universiti Utara Malaysia |
id |
my.uum.repo.20158 |
---|---|
record_format |
eprints |
spelling |
my.uum.repo.20158 2016-12-01T07:23:21Z http://repo.uum.edu.my/20158/ Fuzzy distance-based undersampling technique for imbalanced flood data Ku-Mahamud, Ku Ruhana Zorkeflee, Maisarah Mohamed Din, Aniza QA75 Electronic computers. Computer science Performances of classifiers are affected by imbalanced data because instances in the minority class are often ignored. Imbalanced data often occur in many application domains including flood. If flood cases are misclassified, the impact of flood is higher than the misclassification of non-flood cases. Numerous resampling techniques such as undersampling and oversampling have been used to overcome the problem of misclassification of imbalanced data. However, the undersampling and oversampling techniques suffer from elimination of relevant data and overfitting, which may lead to poor classification results. This paper proposes a Fuzzy Distance-based Undersampling (FDUS) technique to increase classification accuracy. Entropy estimation is used to generate fuzzy thresholds which are used to categorise the instances in majority and minority classes into membership functions. The performance of FDUS was compared with three techniques based on F-measure and G-mean, experimented on flood data. From the results, FDUS achieved better F-measure and G-mean compared to the other techniques which showed that the FDUS was able to reduce the elimination of relevant data. 2016-08-29 Conference or Workshop Item PeerReviewed application/pdf en http://repo.uum.edu.my/20158/1/KMICe2016%20509%20513.pdf Ku-Mahamud, Ku Ruhana and Zorkeflee, Maisarah and Mohamed Din, Aniza (2016) Fuzzy distance-based undersampling technique for imbalanced flood data. In: Knowledge Management International Conference (KMICe) 2016, 29 – 30 August 2016, Chiang Mai, Thailand. http://www.kmice.cms.net.my/kmice2016/files/KMICe2016_eproceeding.pdf |
institution |
Universiti Utara Malaysia |
building |
UUM Library |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Utara Malaysia |
content_source |
UUM Institutional Repository |
url_provider |
http://repo.uum.edu.my/ |
language |
English |
topic |
QA75 Electronic computers. Computer science |
spellingShingle |
QA75 Electronic computers. Computer science Ku-Mahamud, Ku Ruhana Zorkeflee, Maisarah Mohamed Din, Aniza Fuzzy distance-based undersampling technique for imbalanced flood data |
description |
Performances of classifiers are affected by imbalanced data because instances in the minority class are often ignored. Imbalanced data often occur in many application domains including flood. If flood cases are misclassified, the impact of flood is higher than the misclassification of non-flood cases. Numerous resampling techniques such as undersampling and oversampling have been used to overcome the problem of misclassification of imbalanced data. However, the undersampling and oversampling techniques suffer from elimination of relevant data and overfitting, which may lead to poor classification results. This paper proposes a Fuzzy Distance-based Undersampling (FDUS) technique to increase classification accuracy. Entropy estimation is used to generate fuzzy thresholds which are used to categorise the instances in majority and minority classes into membership functions. The performance of FDUS was compared with three techniques based on F-measure and G-mean, experimented on flood data. From the results, FDUS achieved better F-measure and G-mean compared to the other techniques which showed that the FDUS was able to reduce the elimination of relevant data. |
format |
Conference or Workshop Item |
author |
Ku-Mahamud, Ku Ruhana Zorkeflee, Maisarah Mohamed Din, Aniza |
author_facet |
Ku-Mahamud, Ku Ruhana Zorkeflee, Maisarah Mohamed Din, Aniza |
author_sort |
Ku-Mahamud, Ku Ruhana |
title |
Fuzzy distance-based undersampling technique for imbalanced flood data |
title_short |
Fuzzy distance-based undersampling technique for imbalanced flood data |
title_full |
Fuzzy distance-based undersampling technique for imbalanced flood data |
title_fullStr |
Fuzzy distance-based undersampling technique for imbalanced flood data |
title_full_unstemmed |
Fuzzy distance-based undersampling technique for imbalanced flood data |
title_sort |
fuzzy distance-based undersampling technique for imbalanced flood data |
publishDate |
2016 |
url |
http://repo.uum.edu.my/20158/1/KMICe2016%20509%20513.pdf http://repo.uum.edu.my/20158/ http://www.kmice.cms.net.my/kmice2016/files/KMICe2016_eproceeding.pdf |
_version_ |
1644282877854089216 |
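
The abstract compares techniques by F-measure and G-mean, but this record does not define either metric. As a minimal sketch, assuming the standard confusion-matrix definitions (the paper's exact formulation is not given here), the two metrics could be computed as below. The function names and the example counts are hypothetical and are not taken from the paper's flood data.

```python
import math


def f_measure(tp, fp, fn):
    """F-measure (F1): harmonic mean of precision and recall for the
    positive (minority, e.g. flood) class. Assumes the standard definition."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def g_mean(tp, fp, tn, fn):
    """G-mean: geometric mean of sensitivity (minority-class recall)
    and specificity (majority-class recall)."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


def accuracy(tp, fp, tn, fn):
    """Plain accuracy, shown only for contrast with the two metrics above."""
    return (tp + tn) / (tp + fp + tn + fn)


if __name__ == "__main__":
    # Hypothetical imbalanced test set: 10 flood cases, 90 non-flood cases.
    # Classifier A labels everything as non-flood.
    a = dict(tp=0, fp=0, tn=90, fn=10)
    # Classifier B finds most flood cases at the cost of a few false alarms.
    b = dict(tp=8, fp=4, tn=86, fn=2)

    for name, c in (("A", a), ("B", b)):
        print(
            name,
            "accuracy=%.3f" % accuracy(**c),
            "F-measure=%.3f" % f_measure(c["tp"], c["fp"], c["fn"]),
            "G-mean=%.3f" % g_mean(**c),
        )
```

In this illustration, classifier A reaches 90% accuracy while scoring zero on both F-measure and G-mean, because it never detects a minority-class instance. That is the behaviour the abstract describes when it says minority instances are often ignored, and it is one reason both metrics are commonly preferred over accuracy on imbalanced data.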