Safe level graph for majority under-sampling techniques

© 2014, Chiang Mai University. All rights reserved. In classification tasks, imbalanced data causes inadequate predictive performance on a tiny minority class because the decision boundary determined by trivial classifiers tends to be biased toward a huge majority class. For handling the class im...

Bibliographic Details
Main Author: Chumphol Bunkhumpornpat
Format: Journal
Published: 2018
Online Access: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=84936056675&origin=inward
http://cmuir.cmu.ac.th/jspui/handle/6653943832/45475
Institution: Chiang Mai University
id th-cmuir.6653943832-45475
record_format dspace
spelling th-cmuir.6653943832-45475 2018-01-24T06:11:01Z Safe level graph for majority under-sampling techniques Chumphol Bunkhumpornpat 2018-01-24T06:11:01Z 2018-01-24T06:11:01Z 2014-01-01 Journal 01252526 2-s2.0-84936056675
institution Chiang Mai University
building Chiang Mai University Library
country Thailand
collection CMU Intellectual Repository
description © 2014, Chiang Mai University. All rights reserved. In classification tasks, imbalanced data causes inadequate predictive performance on a tiny minority class because the decision boundary determined by trivial classifiers tends to be biased toward a huge majority class. To handle the class imbalance problem, over- and under-sampling are applied at the data level. Over-sampling duplicates or synthesizes instances in a minority class. Although redundant instances do not harm correct classifications, they increase classification costs. Additionally, while synthetic instances expand the learning region, they are not actual instances. Under-sampling removes instances from a majority class to remedy the overlapping problem. Consequently, a downsized dataset can speed up a classification algorithm. This research investigates the behavior of several under-sampling techniques while cleansing distinct majority class regions. We also propose a safe level graph to justify an appropriate parameter for our prior work, MUTE. The experiment shows that our decision from a safe level graph can improve the F-measure of RIPPER when evaluating minority classes.
format Journal
author Chumphol Bunkhumpornpat
title Safe level graph for majority under-sampling techniques
publishDate 2018
url https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=84936056675&origin=inward
http://cmuir.cmu.ac.th/jspui/handle/6653943832/45475