HYBRID SAMPLING METHOD BASED ON DBSCAN AND PARTICLE SWARM OPTIMIZATION (PSO) FOR IMBALANCED DATA CLASSIFICATION
Main Author: | |
---|---|
Format: | Theses |
Language: | Indonesian |
Online Access: | https://digilib.itb.ac.id/gdl/view/78368 |
Institution: | Institut Teknologi Bandung |
Summary: | Imbalanced data refers to a dataset in which the number of data points in one
class differs significantly from the number in another class. On such data,
classification algorithms may fail to predict the minority class accurately even
though they achieve high overall accuracy. Accurate prediction of the minority
class is often what matters most, for example in the diagnosis of rare diseases,
where detecting the disease is crucial. To address the issue of imbalanced data,
this research proposes a hybrid sampling method that combines the undersampling
method proposed by Mirzaei et al. and the oversampling method proposed by
Xiaolong et al., both of which resample based on density using the DBSCAN
algorithm. However, the DBSCAN algorithm is highly sensitive to the minPts and
Eps values, so other research has used Particle Swarm Optimization (PSO) to
determine these two parameters. Therefore, the hybrid sampling method proposed
in this research uses PSO to determine the minPts and Eps parameter values of
the DBSCAN algorithm used for both undersampling and oversampling. |
---|
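
The summary describes using PSO to choose the DBSCAN parameters Eps and minPts. The following is a minimal sketch of that tuning loop only, not the thesis's implementation. The summary does not state the fitness function, the PSO hyperparameters, or the search bounds, so everything here is an assumption made for illustration: the function names (`dbscan`, `placeholder_cost`, `pso_tune`, `make_toy_data`), the inertia and acceleration constants, and in particular the cost, which simply penalizes noise points and a cluster count that differs from a chosen target. The undersampling and oversampling steps of Mirzaei et al. and Xiaolong et al. are not reproduced.

```python
import math
import random


def dbscan(points, eps, min_pts):
    """Plain DBSCAN: returns one label per point (-1 = noise, 0.. = cluster id)."""
    n = len(points)
    # Neighborhoods include the point itself, so min_pts counts the point too.
    neighbors = [[j for j in range(n) if math.dist(points[i], points[j]) <= eps]
                 for i in range(n)]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:  # only core points expand the cluster
                queue.extend(neighbors[j])
    return labels


def placeholder_cost(points, particle, target_clusters=2):
    """Hypothetical cost (lower is better); the source does not define one."""
    eps, min_pts = particle[0], max(1, round(particle[1]))
    labels = dbscan(points, eps, min_pts)
    n_clusters = len(set(labels) - {-1})
    noise_ratio = labels.count(-1) / len(labels)
    return abs(n_clusters - target_clusters) + noise_ratio


def pso_tune(points, cost_fn, bounds, n_particles=10, n_iter=20, seed=0):
    """Global-best PSO over (eps, min_pts); min_pts is rounded to an integer."""
    rng = random.Random(seed)
    w, c1, c2 = 0.7, 1.5, 1.5  # assumed inertia and acceleration constants
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * len(bounds) for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_cost = [cost_fn(points, p) for p in pos]
    g = min(range(n_particles), key=pbest_cost.__getitem__)
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp each coordinate to its search interval.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            cost = cost_fn(points, pos[i])
            if cost < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], cost
                if cost < gbest_cost:
                    gbest, gbest_cost = pos[i][:], cost
    return gbest[0], max(1, round(gbest[1])), gbest_cost


def make_toy_data(seed=0):
    """Two well-separated 2-D Gaussian blobs, used only to exercise the sketch."""
    rng = random.Random(seed)
    blob_a = [(rng.gauss(0, 0.3), rng.gauss(0, 0.3)) for _ in range(30)]
    blob_b = [(rng.gauss(5, 0.3), rng.gauss(5, 0.3)) for _ in range(30)]
    return blob_a + blob_b


if __name__ == "__main__":
    # Assumed search bounds: eps in [0.1, 3.0], min_pts in [2, 10].
    eps, min_pts, cost = pso_tune(make_toy_data(), placeholder_cost,
                                  [(0.1, 3.0), (2, 10)])
    print(f"eps={eps:.3f} min_pts={min_pts} cost={cost:.3f}")
```

In a resampling setting, the cost would more plausibly be tied to downstream classification quality on the minority class, and the tuned DBSCAN would then drive the density-based undersampling and oversampling steps; the placeholder above only shows where such an objective would plug in.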