Safe level graph for majority under-sampling techniques

© 2014, Chiang Mai University. All rights reserved. In classification tasks, imbalanced data leads to inadequate predictive performance on a tiny minority class because the decision boundary determined by trivial classifiers tends to be biased toward a huge majority class. For handling the class im...


Bibliographic Details
Main Author: Chumphol Bunkhumpornpat
Format: Journal
Published: 2018
Subjects:
Online Access: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=84936056675&origin=inward
http://cmuir.cmu.ac.th/jspui/handle/6653943832/53273
Institution: Chiang Mai University
id th-cmuir.6653943832-53273
record_format dspace
spelling th-cmuir.6653943832-53273 2018-09-04T10:01:57Z 2018-09-04T09:46:10Z 2018-09-04T09:46:10Z 2014-01-01 Journal 01252526 2-s2.0-84936056675
institution Chiang Mai University
building Chiang Mai University Library
country Thailand
collection CMU Intellectual Repository
topic Biochemistry, Genetics and Molecular Biology
Chemistry
Materials Science
Mathematics
Physics and Astronomy
description © 2014, Chiang Mai University. All rights reserved. In classification tasks, imbalanced data leads to inadequate predictive performance on a tiny minority class because the decision boundary determined by trivial classifiers tends to be biased toward a huge majority class. To handle the class imbalance problem, over- and under-sampling are applied at the data level. Over-sampling duplicates or synthesizes instances in a minority class. Although redundant instances do not harm correct classifications, they increase classification costs. Additionally, while synthetic instances expand the learning region, they are not actual instances. Under-sampling removes instances from a majority class to remedy the overlapping problem. Consequently, a downsized dataset can speed up a classification algorithm. This research investigates the behavior of several under-sampling techniques while cleansing distinct majority class regions. We also propose a safe level graph to justify an appropriate parameter for our prior work, MUTE. The experiment shows that our decision from a safe level graph can improve the F-measure of RIPPER when evaluating minority classes.
format Journal
author Chumphol Bunkhumpornpat
title Safe level graph for majority under-sampling techniques
publishDate 2018
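
The description in this record mentions two ideas that a short example may make concrete: under-sampling removes majority class instances, and results are judged by the F-measure of the minority class. The following is a minimal Python sketch under stated assumptions. It uses plain random under-sampling as a stand-in and is not MUTE, the safe level graph, or RIPPER, none of which are specified in this record. The function names `random_under_sample` and `f_measure`, the `keep` and `seed` parameters, and the toy labels are hypothetical.

```python
import random
from collections import Counter


def random_under_sample(rows, labels, majority_label, keep, seed=0):
    """Keep all non-majority rows plus a random subset of majority rows.

    Illustrative random under-sampling only; this is NOT the MUTE technique.
    """
    rng = random.Random(seed)
    majority_idx = [i for i, y in enumerate(labels) if y == majority_label]
    # Never ask for more majority rows than actually exist.
    keep = min(keep, len(majority_idx))
    kept_majority = set(rng.sample(majority_idx, keep))
    kept = [
        i for i, y in enumerate(labels)
        if y != majority_label or i in kept_majority
    ]
    return [rows[i] for i in kept], [labels[i] for i in kept]


def f_measure(y_true, y_pred, positive):
    """F-measure (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy imbalanced dataset (hypothetical): 90 majority rows, 10 minority rows.
rows = list(range(100))
labels = ["maj"] * 90 + ["min"] * 10
sampled_rows, sampled_labels = random_under_sample(rows, labels, "maj", keep=10)
class_counts = Counter(sampled_labels)
```

Here `class_counts` holds the class sizes after under-sampling, so the downsized dataset can be inspected before it is passed to a classifier. Scoring with `f_measure(..., positive="min")` focuses the evaluation on the minority class, which is the evaluation the description refers to.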