Survey on highly imbalanced multi-class data
Machine learning technology has a massive impact on society because it offers solutions to many complicated problems such as classification, clustering analysis, and predictions, especially during the COVID-19 pandemic. Data distribution in machine learning has been an essential aspect in providi...
Main Authors: | Abdul Hamid, Mohd Hakim; Yusoff, Marina; Mohamed, Azlinah |
---|---|
Format: | Article |
Language: | English |
Published: | The Science and Information Organization, 2022 |
Online Access: | http://eprints.utem.edu.my/id/eprint/26188/2/PAPER_27-SURVEY_ON_HIGHLY_IMBALANCED_MULTI_CLASS_DATA.PDF http://eprints.utem.edu.my/id/eprint/26188/ https://thesai.org/Downloads/Volume13No6/Paper_27-Survey_on_Highly_Imbalanced_Multi_class_Data.pdf |
Institution: | Universiti Teknikal Malaysia Melaka |
id |
my.utem.eprints.26188 |
---|---|
record_format |
eprints |
spelling |
my.utem.eprints.26188 2023-02-10T15:15:54Z http://eprints.utem.edu.my/id/eprint/26188/ Survey on highly imbalanced multi-class data Abdul Hamid, Mohd Hakim Yusoff, Marina Mohamed, Azlinah
The Science and Information Organization 2022 Article PeerReviewed text en http://eprints.utem.edu.my/id/eprint/26188/2/PAPER_27-SURVEY_ON_HIGHLY_IMBALANCED_MULTI_CLASS_DATA.PDF Abdul Hamid, Mohd Hakim and Yusoff, Marina and Mohamed, Azlinah (2022) Survey on highly imbalanced multi-class data. (IJACSA) INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 13 (6). pp. 211-229. ISSN 2156-5570 https://thesai.org/Downloads/Volume13No6/Paper_27-Survey_on_Highly_Imbalanced_Multi_class_Data.pdf 10.14569/IJACSA.2022.0130627 |
institution |
Universiti Teknikal Malaysia Melaka |
building |
UTEM Library |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Teknikal Malaysia Melaka |
content_source |
UTEM Institutional Repository |
url_provider |
http://eprints.utem.edu.my/ |
language |
English |
description |
Machine learning technology has a massive impact on society because it offers solutions to many complicated problems such as classification, clustering analysis, and predictions, especially during the COVID-19 pandemic. Data distribution in machine learning has been an essential aspect in providing unbiased solutions. From the earliest literature published on highly imbalanced data until recently, machine learning research has focused mostly on binary classification problems. Research on highly imbalanced multi-class data remains largely unexplored, even as handling Big Data demands better analysis and predictions. This study reviews the models and techniques for handling highly imbalanced multi-class data, along with their strengths, weaknesses, and related domains. Furthermore, the paper uses a statistical method to explore a case study with a severely imbalanced dataset. This article aims to (1) understand the trend of highly imbalanced multi-class data through analysis of related literature; (2) analyze the previous and current methods of handling highly imbalanced multi-class data; and (3) construct a framework for highly imbalanced multi-class data. The chosen highly imbalanced multi-class dataset is also analyzed and adapted to current machine learning methods and techniques, followed by a discussion of open challenges and the future direction of highly imbalanced multi-class data. Finally, this paper presents a novel framework for highly imbalanced multi-class data. We hope this research can provide insights into the potential development of better methods or techniques to handle and manipulate highly imbalanced multi-class data. |
format |
Article |
author |
Abdul Hamid, Mohd Hakim Yusoff, Marina Mohamed, Azlinah |
spellingShingle |
Abdul Hamid, Mohd Hakim Yusoff, Marina Mohamed, Azlinah Survey on highly imbalanced multi-class data |
author_facet |
Abdul Hamid, Mohd Hakim Yusoff, Marina Mohamed, Azlinah |
author_sort |
Abdul Hamid, Mohd Hakim |
title |
Survey on highly imbalanced multi-class data |
publisher |
The Science and Information Organization |
publishDate |
2022 |
url |
http://eprints.utem.edu.my/id/eprint/26188/2/PAPER_27-SURVEY_ON_HIGHLY_IMBALANCED_MULTI_CLASS_DATA.PDF http://eprints.utem.edu.my/id/eprint/26188/ https://thesai.org/Downloads/Volume13No6/Paper_27-Survey_on_Highly_Imbalanced_Multi_class_Data.pdf |
_version_ |
1758582071488413696 |