POTENTIAL LEADS SELECTION SYSTEM DESIGN FOR CROSS-SELLING & UPSELLING IN PT X ACADEMY SERIES SERVICES USING DATA MINING MODEL
Main Author: | |
---|---|
Format: | Final Project |
Language: | Indonesia |
Online Access: | https://digilib.itb.ac.id/gdl/view/68664 |
Institution: | Institut Teknologi Bandung |
Summary: | PT X is an education technology startup that provides tutoring services, including
mentorship for studying abroad and tutoring for other languages. One of the biggest
contributors to PT X’s revenue is cross-selling and upselling between the services
that complement each other at PT X. However, these activities decreased by 64.28% in the
early period of 2021. The root cause is that the PT X sales team finds
upselling and cross-selling time-consuming, because every customer must be contacted,
yet the returns are low: not all customers are willing to buy more products.
To solve this problem, this study focuses on creating a data mining model to predict
which leads are likely to accept an upsell or cross-sell offer in one of the most prominent
services for cross-selling and upselling, the academy series for language learning. After the
model is completed, a potential leads selection system is built as a simple prototype
application that sales staff can use to predict the potential of customers, thereby
increasing the effectiveness and efficiency of the whole cross-selling and upselling process.
This study follows the CRISP-DM methodology, one of the most popular data
mining workflows. Its steps are business understanding, data
understanding, data preparation, modelling, evaluation, and deployment.
Five alternative models were used: support vector machine (SVM), random forest
(RF), extreme gradient boosting (XGBoost), light gradient boosting machine (LGBM), and
a voting classifier. Afterwards, alternative imbalanced-learning strategies were
implemented: SMOTENC, SVMSMOTE, BorderlineSMOTE (BLSMOTE),
and RandomOverSampler (ROS). RF with ROS achieved the best
performance, with an average F1-score of 83.83% and an accuracy of 80.13%, which may support
upselling and cross-selling activities and increase revenue generation by a factor of four
compared to the conditions recorded in this study.
|
---|
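The summary names random oversampling (ROS) as the best-performing imbalance strategy and reports results as F1-score and accuracy. The following is a minimal, standard-library sketch of those two ideas, not the author's implementation: the function names `random_oversample`, `accuracy`, and `f1_score` and the toy labels are hypothetical, and the study itself presumably used a library implementation such as `RandomOverSampler` rather than hand-written code.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until every
    class has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # Pool of original rows belonging to this class.
        pool = [r for r, l in zip(rows, labels) if l == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: label 1 = lead accepted an upsell/cross-sell offer.
toy_rows = [[0], [1], [2], [3], [4]]
toy_labels = [0, 0, 0, 0, 1]
balanced_rows, balanced_labels = random_oversample(toy_rows, toy_labels)
print(Counter(balanced_labels))
```

In practice, oversampling is normally applied to the training split only, so that duplicated rows do not leak into the data used to compute F1-score and accuracy.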