Safe level graph for synthetic minority over-sampling techniques

Bibliographic Details
Main Authors: Chumphol Bunkhumpornpat, Sitthichoke Subpaiboonkit
Format: Conference Proceeding
Published: 2018
Online Access: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=84891076473&origin=inward
http://cmuir.cmu.ac.th/jspui/handle/6653943832/47357
Institution: Chiang Mai University
Description
Summary: In the class imbalance problem, most existing classifiers, which are designed for balanced class distributions, fail to recognize minority classes because a large number of negative instances can dominate a few positive instances. Borderline-SMOTE and Safe-Level-SMOTE are over-sampling techniques that handle this situation by generating synthetic instances in different regions: the former operates on the border of a minority class, while the latter works inside the class, far from the border. Unfortunately, a data miner cannot easily determine which SMOTE variant suits a given dataset. This paper proposes a safe level graph as a guideline tool that helps select an appropriate SMOTE and describes the characteristics of the minority class in an imbalanced dataset. Following the advice of the safe level graph, the experimental success rate reaches 73% when the F-measure is used as the performance measure and 78% for satisfactory AUCs. © 2013 IEEE.
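
To make the idea concrete, the following is a minimal sketch rather than the authors' implementation. It assumes the safe level of a minority instance is the number of minority instances among its k nearest neighbours, as in Safe-Level-SMOTE, and it tallies how many minority instances fall at each safe level. This record does not describe how the paper builds its safe level graph, so the tally is only an assumed stand-in for the data such a graph might plot. The function names, the Euclidean distance, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def safe_levels(X, y, k=5, minority_label=1):
    """Return one safe level per minority instance, in dataset order.

    Assumption: safe level = number of minority instances among the
    k nearest neighbours (Euclidean distance, instance itself excluded).
    """
    levels = []
    for i, (xi, yi) in enumerate(zip(X, y)):
        if yi != minority_label:
            continue
        # Distance from instance i to every other instance.
        others = [(math.dist(xi, xj), j) for j, xj in enumerate(X) if j != i]
        others.sort()
        nearest = others[:k]
        levels.append(sum(1 for _, j in nearest if y[j] == minority_label))
    return levels


def safe_level_counts(levels, k=5):
    """Tally minority instances at each safe level 0..k (hypothetical helper)."""
    counts = Counter(levels)
    return [counts.get(level, 0) for level in range(k + 1)]


# Toy one-dimensional data: three clustered minority points and one
# minority point sitting inside the majority region.
X = [[0.0], [1.0], [2.0], [10.0], [11.0], [12.0], [11.4]]
y = [1, 1, 1, 0, 0, 0, 1]

levels = safe_levels(X, y, k=2)
counts = safe_level_counts(levels, k=2)
print(levels)  # [2, 2, 2, 0]
print(counts)  # [1, 0, 3]
```

In this toy case, most minority instances have the maximum safe level, which suggests a minority class lying mostly away from the border, while the single instance at level 0 sits among majority neighbours. How such a distribution maps to a recommended SMOTE variant is the subject of the paper and is not reproduced here.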