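STUDI KOMPARASI ALGORITME MACHINE LEARNING DECISION TREE DAN VARIANNYA UNTUK PENILAIAN RISIKO KREDIT PADA PEER TO PEER LENDING (A Comparative Study of the Decision Tree Machine Learning Algorithm and Its Variants for Credit Risk Assessment in Peer-to-Peer Lending)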
Main Author: | |
---|---|
Format: | Final Project |
Language: | Indonesian |
Subjects: | |
Online Access: | https://digilib.itb.ac.id/gdl/view/64896 |
Institution: | Institut Teknologi Bandung |
Summary: | The ease of lending through technology startups is a major opportunity in Indonesia because only 49% of Indonesia's 264 million citizens have bank accounts. Startups use peer-to-peer (P2P) lending schemes to channel funds to borrowers. As a result, the funds channeled through P2P lending are not corporate funds but the funds of individual investors. Unlike banks, which have high compliance standards and transaction histories, P2P lending startups are limited in their ability to provide large loans because they lack historical data on prospective borrowers. Machine learning offers a way to assess customers' credit applications, because companies can use historical data to build models more quickly and accurately. In this final project, decision tree-based algorithms, namely CART, Random Forest, and XGBoost (Extreme Gradient Boosting), are used to build models from the data of Lending Club, a P2P lending company in the United States, as a case study. The algorithms are implemented in five stages following the CRISP-DM method: business understanding, data understanding, data preprocessing, parameter optimization and modeling, and evaluation. In the preprocessing stage, feature selection is performed with an Extra Trees classifier. In the parameter optimization and modeling stage, grid search is used with k-fold cross-validation. In the evaluation stage, five metrics (accuracy, F1 score, precision, recall, and specificity) are used together with AUC, with AUC as the primary criterion. The results indicate that, when applied to the Lending Club backtesting data, each algorithm has its own advantages. The decision tree algorithm has the best AUC, 0.937893. Random Forest is superior on four metrics: accuracy (0.955366822), precision (0.922996255), F1 score (0.882626821), and specificity (0.982533048). XGBoost achieves the best recall, 0.930603. |
---|
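
The evaluation metrics named in the summary can be illustrated with a minimal Python sketch. This is not the author's implementation: the function names, the toy labels and scores, the 0.5 decision threshold, and the convention that label 1 is the positive class (for example, a defaulted loan) are all assumptions made for illustration. AUC is computed here by direct pairwise comparison of positive and negative scores, which is only practical for small samples.

```python
from typing import Dict, Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, int]:
    """Count TP, FP, TN, FN for binary labels (assumption: 1 = positive class)."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            counts["tp"] += 1
        elif truth == 0 and pred == 1:
            counts["fp"] += 1
        elif truth == 0 and pred == 0:
            counts["tn"] += 1
        else:  # truth == 1 and pred == 0
            counts["fn"] += 1
    return counts


def _ratio(numerator: float, denominator: float) -> float:
    """Divide safely, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, specificity, and F1 from hard predictions."""
    c = confusion_counts(y_true, y_pred)
    total = c["tp"] + c["fp"] + c["tn"] + c["fn"]
    precision = _ratio(c["tp"], c["tp"] + c["fp"])
    recall = _ratio(c["tp"], c["tp"] + c["fn"])
    return {
        "accuracy": _ratio(c["tp"] + c["tn"], total),
        "precision": precision,
        "recall": recall,
        "specificity": _ratio(c["tn"], c["tn"] + c["fp"]),
        "f1": _ratio(2 * precision * recall, precision + recall),
    }


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, not taken from Lending Club).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.7, 0.4, 0.6, 0.1, 0.8, 0.3]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
metrics["auc"] = auc(y_true, scores)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Unlike the other five metrics, AUC is computed from the model's scores rather than from thresholded predictions, so it does not depend on the choice of decision threshold.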