Fuzzy distance-based undersampling technique for imbalanced flood data

Performances of classifiers are affected by imbalanced data because instances in the minority class are often ignored. Imbalanced data often occur in many application domains including flood. If flood cases are misclassified, the impact of flood is higher than the misclassification of non-flood cas...

Bibliographic Details
Main Authors: Ku-Mahamud, Ku Ruhana, Zorkeflee, Maisarah, Mohamed Din, Aniza
Format: Conference or Workshop Item
Language: English
Published: 2016
Subjects: QA75 Electronic computers. Computer science
Online Access: http://repo.uum.edu.my/20158/1/KMICe2016%20509%20513.pdf
http://repo.uum.edu.my/20158/
http://www.kmice.cms.net.my/kmice2016/files/KMICe2016_eproceeding.pdf
Institution: Universiti Utara Malaysia
id my.uum.repo.20158
record_format eprints
spelling my.uum.repo.20158 2016-12-01T07:23:21Z http://repo.uum.edu.my/20158/ Fuzzy distance-based undersampling technique for imbalanced flood data Ku-Mahamud, Ku Ruhana Zorkeflee, Maisarah Mohamed Din, Aniza QA75 Electronic computers. Computer science Performances of classifiers are affected by imbalanced data because instances in the minority class are often ignored. Imbalanced data often occur in many application domains including flood. If flood cases are misclassified, the impact of flood is higher than the misclassification of non-flood cases. Numerous resampling techniques such as undersampling and oversampling have been used to overcome the problem of misclassification of imbalanced data. However, the undersampling and oversampling techniques suffer from elimination of relevant data and overfitting, which may lead to poor classification results. This paper proposes a Fuzzy Distance-based Undersampling (FDUS) technique to increase classification accuracy. Entropy estimation is used to generate fuzzy thresholds which are used to categorise the instances in majority and minority classes into membership functions. The performance of FDUS was compared with three techniques based on F-measure and G-mean, experimented on flood data. From the results, FDUS achieved better F-measure and G-mean compared to the other techniques which showed that the FDUS was able to reduce the elimination of relevant data. 2016-08-29 Conference or Workshop Item PeerReviewed application/pdf en http://repo.uum.edu.my/20158/1/KMICe2016%20509%20513.pdf Ku-Mahamud, Ku Ruhana and Zorkeflee, Maisarah and Mohamed Din, Aniza (2016) Fuzzy distance-based undersampling technique for imbalanced flood data. In: Knowledge Management International Conference (KMICe) 2016, 29 – 30 August 2016, Chiang Mai, Thailand. http://www.kmice.cms.net.my/kmice2016/files/KMICe2016_eproceeding.pdf
institution Universiti Utara Malaysia
building UUM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Utara Malaysia
content_source UUM Institutional Repository
url_provider http://repo.uum.edu.my/
language English
topic QA75 Electronic computers. Computer science
description Performances of classifiers are affected by imbalanced data because instances in the minority class are often ignored. Imbalanced data often occur in many application domains including flood. If flood cases are misclassified, the impact of flood is higher than the misclassification of non-flood cases. Numerous resampling techniques such as undersampling and oversampling have been used to overcome the problem of misclassification of imbalanced data. However, the undersampling and oversampling techniques suffer from elimination of relevant data and overfitting, which may lead to poor classification results. This paper proposes a Fuzzy Distance-based Undersampling (FDUS) technique to increase classification accuracy. Entropy estimation is used to generate fuzzy thresholds which are used to categorise the instances in majority and minority classes into membership functions. The performance of FDUS was compared with three techniques based on F-measure and G-mean, experimented on flood data. From the results, FDUS achieved better F-measure and G-mean compared to the other techniques which showed that the FDUS was able to reduce the elimination of relevant data.
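The description reports results in terms of F-measure and G-mean without defining them. The sketch below shows the standard definitions of both metrics computed from a binary confusion matrix, treating the minority (flood) class as positive. It is not taken from the paper: the function name and the counts in the usage line are hypothetical, chosen only for illustration, and the paper may use a weighted variant of F-measure.

```python
import math


def f_measure_and_g_mean(tp, fn, fp, tn):
    """Return (F-measure, G-mean) for the positive (minority) class.

    tp, fn, fp, tn are confusion-matrix counts. This is the standard
    balanced F-measure (F1); it is an assumption that the paper uses
    this unweighted form.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0          # sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0

    if precision + recall == 0:
        f_measure = 0.0
    else:
        f_measure = 2 * precision * recall / (precision + recall)

    # G-mean is the geometric mean of sensitivity and specificity,
    # so it drops to zero when either class is entirely misclassified.
    g_mean = math.sqrt(recall * specificity)
    return f_measure, g_mean


# Hypothetical counts, not results from the paper.
f, g = f_measure_and_g_mean(tp=8, fn=2, fp=4, tn=86)
print(f"F-measure={f:.3f}, G-mean={g:.3f}")
```

Both metrics penalise a classifier that ignores the minority class, which plain accuracy does not: labelling every instance as non-flood in the hypothetical example would give 90% accuracy but an F-measure and G-mean of zero.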
format Conference or Workshop Item
author Ku-Mahamud, Ku Ruhana
Zorkeflee, Maisarah
Mohamed Din, Aniza
title Fuzzy distance-based undersampling technique for imbalanced flood data
publishDate 2016
url http://repo.uum.edu.my/20158/1/KMICe2016%20509%20513.pdf
http://repo.uum.edu.my/20158/
http://www.kmice.cms.net.my/kmice2016/files/KMICe2016_eproceeding.pdf
_version_ 1644282877854089216