Predicting Customer Buying Decisions for Online Shopping with Unbalanced Data Set

Bibliographic Details
Main Author: Yap, Chau Tean
Format: Final Year Project / Dissertation / Thesis
Published: 2022
Online Access: http://eprints.utar.edu.my/4990/1/YAP_CHAU_TEAN_2000681.pdf
http://eprints.utar.edu.my/4990/
Institution: Universiti Tunku Abdul Rahman
Description
Summary: One of the common e-commerce problems is a low purchase conversion rate. Data mining techniques can help tackle this problem by analysing and predicting customer purchase intention, so that customers receive better service and better recommendations. In this project, the real-time online shoppers purchasing intention data set from Sakar et al. (2018) was used. The data set is unbalanced: 15.5% of its instances belong to the positive class and 84.5% to the negative class. Weka, a data mining tool, was used to classify the data set with different machine learning algorithms. Six machine learning algorithms were applied and compared using classification evaluation metrics: K-Nearest Neighbor (KNN), Naïve Bayes, J48, Support Vector Machine (SVM), Sequential Minimal Optimization (SMO) and Multilayer Perceptron (MLP). Because data pre-processing may improve classification results, over-sampling, under-sampling and hybrid sampling were applied to modify the class distribution of the data set. The hybrid sampling method gave classification results comparable to those of Sakar et al. (2018). The ensemble learning methods AdaBoost and Bagging were also tested but showed no improvement on this data set.
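
To make the idea of hybrid sampling concrete, the following is a minimal Python sketch and not the procedure used in the thesis, which relied on Weka. It assumes that "hybrid" means randomly under-sampling the majority class and randomly over-sampling the minority class (by duplication) toward a common size. The function name `hybrid_resample`, the target of 500 rows per class, and the synthetic data are all illustrative assumptions; only the 15.5% / 84.5% class ratio comes from the summary.

```python
import random
from collections import Counter


def hybrid_resample(rows, labels, target_per_class, seed=0):
    """Resample so that every class ends up with target_per_class rows.

    Classes larger than the target are under-sampled without replacement;
    classes smaller than the target are over-sampled by duplicating
    randomly chosen rows (sampling with replacement).
    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)

    # Group the rows by their class label.
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        if len(members) >= target_per_class:
            # Majority class: keep a random subset.
            chosen = rng.sample(members, target_per_class)
        else:
            # Minority class: keep all rows and add random duplicates.
            extra = rng.choices(members, k=target_per_class - len(members))
            chosen = members + extra
        out_rows.extend(chosen)
        out_labels.extend([label] * target_per_class)
    return out_rows, out_labels


# Synthetic stand-in with the class ratio quoted in the summary
# (15.5% positive, 84.5% negative); not the real data set.
labels = [1] * 155 + [0] * 845
rows = [[i] for i in range(len(labels))]  # placeholder feature vectors

new_rows, new_labels = hybrid_resample(rows, labels, target_per_class=500)
print(Counter(new_labels))
```

In practice, resampling of this kind would normally be applied to the training split only, so that the evaluation data keeps the original class distribution.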