Improved students' performance prediction for multi-class imbalanced problems using hybrid and ensemble approach in educational data mining

Class imbalance is a well-known and recurring problem in data mining. Researchers have studied it in several fields using three common families of techniques: sampling, ensemble learning, and cost-sensitive learning. However, such studies are still new in educatio...

Bibliographic Details
Main Authors: Hassan, H., Ahmad, N. B., Anuar, S.
Format: Conference or Workshop Item
Language: English
Published: 2020
Subjects: QC Physics
Online Access: http://eprints.utm.my/id/eprint/93715/1/HasnizaHassan2020_ImprovedStudentsPerformancePrediction.pdf
http://eprints.utm.my/id/eprint/93715/
http://dx.doi.org/10.1088/1742-6596/1529/5/052041
Institution: Universiti Teknologi Malaysia
id my.utm.93715
record_format eprints
spelling my.utm.93715 2021-12-31T08:28:30Z http://eprints.utm.my/id/eprint/93715/
Hassan, H. and Ahmad, N. B. and Anuar, S. (2020) Improved students' performance prediction for multi-class imbalanced problems using hybrid and ensemble approach in educational data mining. In: 2nd Joint International Conference on Emerging Computing Technology and Sports, JICETS 2019, 25-27 Nov 2019, Bandung, Indonesia. PeerReviewed. http://dx.doi.org/10.1088/1742-6596/1529/5/052041
building UTM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Malaysia
content_source UTM Institutional Repository
url_provider http://eprints.utm.my/
topic QC Physics
description Class imbalance is a well-known and recurring problem in data mining. Researchers have studied it in several fields using three common families of techniques: sampling, ensemble learning, and cost-sensitive learning. However, such studies are still new in the education domain. The problem is closely tied to data quality, which has the greatest impact on the accuracy of prediction results. Many previous studies focus on binary imbalanced classification rather than the multi-class imbalance problem in education data. This study used 4413 student instances from two datasets, a student information system and an e-learning system, from the Faculty of Engineering at a Malaysian university for the first semester of 2017/2018. Three categories of sampling techniques were used: oversampling, undersampling, and hybrid techniques. The research empirically analyzes five types of ensemble classifiers and seven sampling techniques. The experimental results show that the hybrid of ROS with AdaBoost produces the best performance compared with the other benchmark techniques. The SMOTEENN technique combined with ensemble classifiers consistently produces high results. This technique has great potential for improving students' performance prediction models.
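The oversampling idea mentioned in the description can be illustrated with a minimal sketch. It assumes that "ROS" refers to random oversampling, meaning minority-class records are duplicated at random until every class is as large as the biggest one. The function name, the class labels, and the toy data are hypothetical and do not come from the study. The record does not state which tooling the authors used.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until all classes match the largest class."""
    rng = random.Random(seed)

    # Group the rows by their class label.
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    # Every class is grown to the size of the largest class.
    target = max(len(members) for members in by_class.values())

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        # Sample with replacement from the existing rows of this class.
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical toy data: three performance classes with a 6:3:1 imbalance.
labels = ["pass"] * 6 + ["borderline"] * 3 + ["fail"] * 1
rows = [[i, i % 4] for i in range(len(labels))]

balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(labels))           # class counts before resampling
print(Counter(balanced_labels))  # class counts after resampling
```

In a pipeline of the kind the description outlines, the balanced rows would then be passed to an ensemble classifier. Resampling is normally applied to the training split only, so that duplicated records do not leak into the evaluation data. Libraries such as imbalanced-learn (`RandomOverSampler`, `SMOTEENN`) and scikit-learn (`AdaBoostClassifier`) provide maintained implementations of these building blocks.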