Machine learning for mathematical question difficulty classification

Bibliographic Details
Main Author: Pang, Jarald Qi Kai
Other Authors: Hui Siu Cheung
Format: Final Year Project
Language: English
Published: 2019
Subjects:
Online Access: http://hdl.handle.net/10356/76982
Institution: Nanyang Technological University
Description
Summary: This project is an experimental study of how machine learning models can be used to classify GCE ‘A’ Level mathematical questions. Two levels of classification are carried out: first, questions are classified by topic; second, they are classified by difficulty level. The report explains in detail the steps followed during the experiment. The evaluation metrics used are F1 score, precision, recall and accuracy. For data pre-processing, three text vectorization methods were used and tested: count vectors, word-level TF-IDF and N-gram-level TF-IDF. Four machine learning methods, Support Vector Machines, Naïve Bayes, Random Forest and Extreme Gradient Boosting, were then used to classify each question by topic, and the models’ performance on each topic was analysed. The same four methods were then used to classify the difficulty of each question, taking the vectorized question and the predicted topic as input. A final analysis was done on the models’ performance in difficulty classification.
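As an illustration of the four evaluation metrics named in the summary, the following is a minimal sketch of how per-topic precision, recall and F1 score, together with overall accuracy, could be computed from true and predicted labels. It is not taken from the report: the function name per_class_metrics, the topic labels and the choice to score each class separately are assumptions made for this example only.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Return ({label: (precision, recall, f1)}, accuracy).

    Hypothetical helper written for illustration; the report's own
    evaluation code is not shown in this record.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1      # correct prediction for this class
        else:
            fp[pred] += 1       # predicted class gains a false positive
            fn[truth] += 1      # true class gains a false negative

    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted_total = tp[label] + fp[label]
        actual_total = tp[label] + fn[label]
        precision = tp[label] / predicted_total if predicted_total else 0.0
        recall = tp[label] / actual_total if actual_total else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)

    accuracy = sum(tp.values()) / len(y_true)
    return scores, accuracy


# Made-up topic labels, used only to exercise the function.
y_true = ["vectors", "calculus", "calculus", "statistics", "vectors", "calculus"]
y_pred = ["vectors", "calculus", "vectors", "statistics", "vectors", "statistics"]

scores, accuracy = per_class_metrics(y_true, y_pred)
for label, (precision, recall, f1) in scores.items():
    print(f"{label}: precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
print(f"accuracy={accuracy:.2f}")
```

Scoring each class separately matches the per-topic analysis the summary describes. Whether the report also averages these scores across classes, and how, is not stated in this record.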