Survey on highly imbalanced multi-class data

Machine learning technology has a massive impact on society because it offers solutions to many complicated problems, such as classification, clustering analysis, and prediction, especially during the COVID-19 pandemic. Data distribution in machine learning has been an essential aspect in providi...

Bibliographic Details
Main Authors: Abdul Hamid, Mohd Hakim, Yusoff, Marina, Mohamed, Azlinah
Format: Article
Language: English
Published: The Science and Information Organization 2022
Online Access: http://eprints.utem.edu.my/id/eprint/26188/2/PAPER_27-SURVEY_ON_HIGHLY_IMBALANCED_MULTI_CLASS_DATA.PDF
http://eprints.utem.edu.my/id/eprint/26188/
https://thesai.org/Downloads/Volume13No6/Paper_27-Survey_on_Highly_Imbalanced_Multi_class_Data.pdf
Institution: Universiti Teknikal Malaysia Melaka
id my.utem.eprints.26188
record_format eprints
citation Abdul Hamid, Mohd Hakim and Yusoff, Marina and Mohamed, Azlinah (2022) Survey on highly imbalanced multi-class data. International Journal of Advanced Computer Science and Applications (IJACSA), 13 (6). pp. 211-229. ISSN 2156-5570. DOI: 10.14569/IJACSA.2022.0130627. Article, peer reviewed. Record datestamp 2023-02-10T15:15:54Z.
institution Universiti Teknikal Malaysia Melaka
building UTEM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknikal Malaysia Melaka
content_source UTEM Institutional Repository
url_provider http://eprints.utem.edu.my/
language English
description Machine learning technology has a massive impact on society because it offers solutions to many complicated problems, such as classification, clustering analysis, and prediction, especially during the COVID-19 pandemic. Data distribution is an essential factor in providing unbiased machine learning solutions. From the earliest literature on highly imbalanced data until recently, machine learning research has focused mostly on binary classification problems. Research on highly imbalanced multi-class data remains largely unexplored, even as handling Big Data demands better analysis and prediction. This study reviews the models and techniques for handling highly imbalanced multi-class data, along with their strengths, weaknesses, and related domains. Furthermore, the paper uses a statistical method to explore a case study with a severely imbalanced dataset. This article aims to (1) understand the trend of highly imbalanced multi-class data through analysis of the related literature; (2) analyze previous and current methods of handling highly imbalanced multi-class data; and (3) construct a framework for highly imbalanced multi-class data. The chosen highly imbalanced multi-class dataset is also analyzed and adapted to current machine learning methods and techniques, followed by a discussion of open challenges and future directions for highly imbalanced multi-class data. Finally, this paper presents a novel framework for highly imbalanced multi-class data. We hope this research can provide insights into the potential development of better methods or techniques to handle and manipulate highly imbalanced multi-class data.
format Article
author Abdul Hamid, Mohd Hakim
Yusoff, Marina
Mohamed, Azlinah
publisher The Science and Information Organization
publishDate 2022
_version_ 1758582071488413696