Online Machine Learning from Non-stationary Data Streams in the Presence of Concept Drift and Class Imbalance: A Systematic Review

In IoT environments, applications generate continuous non-stationary data streams with inherent problems of concept drift and class imbalance, which degrade classifier performance. Imbalanced data affects the classifier during both concept detection and concept adaptation. In general, for conc...

Bibliographic Details
Main Authors: Palli, Abdul Sattar, Jaafar, Jafreezal, Gilal, Abdul Rehman, Alsughayyir, Aeshah, Gomes, Heitor Murilo, Alshanqiti, Abdullah, Omar, Mazni
Format: Article
Language: English
Published: Universiti Utara Malaysia Press 2024
Subjects: QA75 Electronic computers. Computer science
Online Access: https://repo.uum.edu.my/id/eprint/30350/1/JICT%2023%2001%202024%20105-139.pdf
https://doi.org/10.32890/jict2024.23.1.5
https://repo.uum.edu.my/id/eprint/30350/
https://e-journal.uum.edu.my/index.php/jict/article/view/20733
Institution: Universiti Utara Malaysia
id my.uum.repo.30350
record_format eprints
spelling my.uum.repo.30350 2024-02-01T14:01:55Z Palli, Abdul Sattar and Jaafar, Jafreezal and Gilal, Abdul Rehman and Alsughayyir, Aeshah and Gomes, Heitor Murilo and Alshanqiti, Abdullah and Omar, Mazni (2024) Online Machine Learning from Non-stationary Data Streams in the Presence of Concept Drift and Class Imbalance: A Systematic Review. Journal of Information and Communication Technology, 23 (1). pp. 105-139. ISSN 2180-3862. Universiti Utara Malaysia Press. Article. PeerReviewed. application/pdf. en. cc4_by. QA75 Electronic computers. Computer science.
institution Universiti Utara Malaysia
building UUM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Utara Malaysia
content_source UUM Institutional Repository
url_provider http://repo.uum.edu.my/
language English
topic QA75 Electronic computers. Computer science
description In IoT environments, applications generate continuous non-stationary data streams with inherent problems of concept drift and class imbalance, which degrade classifier performance. Imbalanced data affects the classifier during both concept detection and concept adaptation. In general, for concept detection, a separate mechanism called a drift detector runs in parallel with the classifier to detect concept drift. For concept adaptation, the classifier either updates itself or trains a new classifier to replace the older one. If the data stream is also class-imbalanced, the classifier may not adapt properly to the latest concept. In this survey, we study how existing work addresses class imbalance and concept drift while learning from non-stationary data streams. We further highlight the limitations of existing work and the challenges caused by other factors of class imbalance combined with concept drift in data stream classification. Of 1110 studies, applying our inclusion and exclusion criteria narrowed the pool to 35 articles that directly addressed our study objectives. The study found that issues such as multiple concept drift types, a dynamic class imbalance ratio, and multi-class imbalance in the presence of concept drift remain open for further research. We also observed that, while major research effort has been dedicated to resolving concept drift and class imbalance, little attention has been given to within-class imbalance, rare examples, and borderline instances when they co-occur with concept drift in multi-class data. The paper concludes with suggested future directions.
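The detect-then-adapt loop described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the reviewed paper: `MajorityClassifier`, `WindowDriftDetector`, `run_stream`, and the `window` and `delta` parameters are all hypothetical names chosen for illustration, and the detector is a deliberately simple error-rate comparison rather than any published method.

```python
from collections import Counter, deque


class MajorityClassifier:
    """Toy incremental classifier (hypothetical): predicts the most frequent label seen."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, x):
        # Before any training, fall back to label 0.
        if not self.counts:
            return 0
        return self.counts.most_common(1)[0][0]

    def learn(self, x, y):
        self.counts[y] += 1


class WindowDriftDetector:
    """Hypothetical detector: flags drift when the error rate over the most
    recent `window` predictions exceeds a reference rate by more than `delta`."""

    def __init__(self, window=50, delta=0.3):
        self.window = window
        self.delta = delta
        self.reference = None              # error rate of the first full window
        self.recent = deque(maxlen=window)

    def update(self, error):
        self.recent.append(1 if error else 0)
        if len(self.recent) < self.window:
            return False                   # not enough evidence yet
        rate = sum(self.recent) / self.window
        if self.reference is None:
            self.reference = rate          # remember the baseline error rate
            return False
        return rate - self.reference > self.delta

    def reset(self):
        self.reference = None
        self.recent.clear()


def run_stream(stream, window=50, delta=0.3):
    """Test-then-train loop with the detector running alongside the classifier."""
    model = MajorityClassifier()
    detector = WindowDriftDetector(window, delta)
    drift_points = []
    for i, (x, y) in enumerate(stream):
        y_hat = model.predict(x)           # test on the new example first
        if detector.update(y_hat != y):
            drift_points.append(i)
            model = MajorityClassifier()   # adaptation: replace the old classifier
            detector.reset()
        model.learn(x, y)                  # then train on it
    return drift_points
```

Calling `run_stream` on a stream whose label switches abruptly from 0 to 1 should report a single drift point shortly after the switch. Because this detector monitors the overall error rate, its signal is dominated by the majority class, which is one way class imbalance can interfere with drift detection.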
format Article
author Palli, Abdul Sattar
Jaafar, Jafreezal
Gilal, Abdul Rehman
Alsughayyir, Aeshah
Gomes, Heitor Murilo
Alshanqiti, Abdullah
Omar, Mazni
title Online Machine Learning from Non-stationary Data Streams in the Presence of Concept Drift and Class Imbalance: A Systematic Review
publisher Universiti Utara Malaysia Press
publishDate 2024
_version_ 1789943850653974528