Safe level graph for majority under-sampling techniques

© 2014, Chiang Mai University. All rights reserved. In classification tasks, imbalanced data causes inadequate predictive performance on a tiny minority class because the decision boundary determined by trivial classifiers tends to be biased toward a huge majority class. For handling the class im...

Bibliographic Details
Main Author:	Chumphol Bunkhumpornpat
Format:	Journal
Published:	2018
Subjects:	Biochemistry, Genetics and Molecular Biology; Chemistry; Materials Science; Mathematics; Physics and Astronomy
Online Access:	https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=84936056675&origin=inward http://cmuir.cmu.ac.th/jspui/handle/6653943832/53273
Institution:	Chiang Mai University

id	th-cmuir.6653943832-53273
record_format	dspace
spelling	th-cmuir.6653943832-532732018-09-04T10:01:57Z Safe level graph for majority under-sampling techniques Chumphol Bunkhumpornpat Biochemistry, Genetics and Molecular Biology Chemistry Materials Science Mathematics Physics and Astronomy © 2014, Chiang Mai University. All rights reserved. In classification tasks, imbalance data causes the inadequate predictive performance of a tiny minority class because the decision boundary determined by trivial classifiers tends to be biased toward a huge majority class. For handling the class imbalance problem, over- and under-sampling are applied at the data level. Over-sampling duplicates or synthesizes instances into a minority class. Although redundant instances do not harm correct classifications, they increase classification costs. Additionally, while synthetic instances expand the learning region, they are not actual instances. Under-sampling removes instances from a majority class to remedy the overlapping problem. Consequently, a downsized dataset can speed up a classification algorithm. This research investigates the behavior of several under-sampling techniques, while cleansing distinct majority class regions. We also propose a safe level graph to justify an appropriate parameter of our prior work, MUTE. The experiment shows that our decision from a safe level graph can improve the F-measure of RIPPER when evaluating minority classes. 2018-09-04T09:46:10Z 2018-09-04T09:46:10Z 2014-01-01 Journal 01252526 2-s2.0-84936056675 https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=84936056675&origin=inward http://cmuir.cmu.ac.th/jspui/handle/6653943832/53273
institution	Chiang Mai University
building	Chiang Mai University Library
country	Thailand
collection	CMU Intellectual Repository
topic	Biochemistry, Genetics and Molecular Biology Chemistry Materials Science Mathematics Physics and Astronomy
spellingShingle	Biochemistry, Genetics and Molecular Biology Chemistry Materials Science Mathematics Physics and Astronomy Chumphol Bunkhumpornpat Safe level graph for majority under-sampling techniques
description	© 2014, Chiang Mai University. All rights reserved. In classification tasks, imbalanced data causes inadequate predictive performance on a tiny minority class because the decision boundary determined by trivial classifiers tends to be biased toward a huge majority class. For handling the class imbalance problem, over- and under-sampling are applied at the data level. Over-sampling duplicates or synthesizes instances in a minority class. Although redundant instances do not harm correct classifications, they increase classification costs. Additionally, while synthetic instances expand the learning region, they are not actual instances. Under-sampling removes instances from a majority class to remedy the overlapping problem. Consequently, a downsized dataset can speed up a classification algorithm. This research investigates the behavior of several under-sampling techniques as they cleanse distinct majority class regions. We also propose a safe level graph to justify an appropriate parameter for our prior work, MUTE. The experiment shows that our decision from a safe level graph can improve the F-measure of RIPPER when evaluating minority classes.
format	Journal
author	Chumphol Bunkhumpornpat
author_facet	Chumphol Bunkhumpornpat
author_sort	Chumphol Bunkhumpornpat
title	Safe level graph for majority under-sampling techniques
title_short	Safe level graph for majority under-sampling techniques
title_full	Safe level graph for majority under-sampling techniques
title_fullStr	Safe level graph for majority under-sampling techniques
title_full_unstemmed	Safe level graph for majority under-sampling techniques
title_sort	safe level graph for majority under-sampling techniques
publishDate	2018
url	https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=84936056675&origin=inward http://cmuir.cmu.ac.th/jspui/handle/6653943832/53273
_version_	1681424104511504384

Similar Items