Development of big data analytics tools for e-learning

Bibliographic Details
Main Author: Tan, Samuel Boon Hao
Other Authors: Chua Hock Chuan
Format: Final Year Project
Language: English
Published: 2018
Online Access: http://hdl.handle.net/10356/74683
Institution: Nanyang Technological University
Description
Summary: Background: The purpose of this research is to predict students’ final outcomes on a weekly basis using machine learning algorithms. Data generated by online learning platforms can be analysed to make such predictions. Method: Attributes were tested for significant differences using ANOVA or a Z-test. The classification models used were K Nearest Neighbor (KNN), Support Vector Classifier (SVC), Gradient Boost Classifier (GBM), Logistic Regression (LR), Decision Tree Classifier (DTC), Naive Bayesian (NB) and Random Forest Classifier (RFC). Predictions for a past batch were used to obtain each model’s accuracy score. Result: The ‘V’, ‘BG_S’ and ‘C3’ attributes have no effect on the final outcome (P>0.05); only the ‘BG_Cat’, ‘C1’ and ‘C2’ attributes have an effect (P<0.05). Different classification models were used because each operates differently. The best-scoring model was KNN, with an accuracy of 77.17%, followed by GBM, RFC, DTC, SVC, LR and NB (77.17%, 72.83%, 71.74%, 71.74%, 60.87% and 46.74%, respectively). Conclusion: Relevant data suited to machine learning affects prediction accuracy, and cross-validation can be used with all models to train on the data in different cases. Accuracy could be increased further with newer machine learning algorithms. ‘C3’ was not used, although it counts for 13% of the final outcome, because almost every data point has a value close to the average. Feature engineering was applied so that, for any record with a zero-valued attribute ‘H’, the predicted result is ‘F’.
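The last step of the summary, a nearest-neighbour prediction combined with the zero-valued ‘H’ rule, can be illustrated with a minimal sketch. This is not the project's code: the function names, the toy (C1, C2) scores, the label ‘P’ and the choice of k=3 are all hypothetical, and the meaning of ‘H’ is not stated in this record. It only shows one way such a rule might sit on top of a KNN majority vote.

```python
from collections import Counter
from math import dist  # Euclidean distance, Python 3.8+


def knn_predict(train, query, k=3):
    """Majority vote among the k training rows nearest to `query`."""
    # train is a list of (features, label) pairs with equal-length feature tuples.
    nearest = sorted(train, key=lambda row: dist(row[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def predict_outcome(train, query, h_value, k=3):
    """Apply the zero-'H' rule first; otherwise fall back to the KNN vote."""
    if h_value == 0:
        # Assumed reading of the summary: a zero 'H' always yields 'F'.
        return "F"
    return knn_predict(train, query, k)


# Hypothetical toy data: (C1, C2) scores with made-up outcome labels.
train = [
    ((80.0, 75.0), "P"),
    ((85.0, 90.0), "P"),
    ((70.0, 65.0), "P"),
    ((30.0, 40.0), "F"),
    ((35.0, 20.0), "F"),
    ((25.0, 30.0), "F"),
]

print(predict_outcome(train, (78.0, 80.0), h_value=5))
print(predict_outcome(train, (78.0, 80.0), h_value=0))
```

Placing the rule before the model call means a zero ‘H’ overrides whatever the classifier would have predicted, which is one plausible interpretation of the summary rather than a confirmed detail of the project.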