Data driven determination of retail banking customer credit worthiness

Bibliographic Details
Main Author: Wang, Christabell Weiqi
Other Authors: Wong Kin Shun, Terence
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2024
Online Access: https://hdl.handle.net/10356/181758
Institution: Nanyang Technological University
Description
Summary: Credit risk assessment is a critical challenge in the finance industry, because accurate predictions of customer creditworthiness affect lending decisions and financial stability. This project explores the application of machine learning models to credit risk prediction, using feature engineering and techniques for handling imbalanced datasets to improve model performance. The study compares traditional classifiers such as Logistic Regression and Decision Trees with ensemble models such as Random Forest, LightGBM, and XGBoost. To address the heavy class imbalance in the dataset, the Synthetic Minority Oversampling Technique (SMOTE) was applied to the training data, and SelectKBest was used to select features by statistical relevance. The results indicate that the ensemble models outperform the traditional classifiers, achieving higher precision, recall, and ROC-AUC scores. A final ensemble combining Random Forest, LightGBM, and XGBoost was used to achieve the best performance. The study highlights the importance of balancing datasets, meaningful feature engineering, and thorough evaluation for robust credit risk analysis. By integrating domain knowledge with machine learning techniques, it demonstrates the potential to improve credit risk prediction accuracy while reducing reliance on purely heuristic methods. The findings offer a foundation for future research on integrating more advanced models and on addressing ethical considerations in automated lending systems, with the aim of supporting more efficient and equitable financial decision-making.
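The oversampling step mentioned in the summary can be illustrated with a minimal sketch. This is not the project's code: the record does not say which SMOTE implementation was used (in practice a library such as imbalanced-learn would be the usual choice). The function names `smote` and `oversample_training_set`, the toy data, and the choice of `k` are all assumptions made for illustration. The sketch shows the core idea of SMOTE, which is to create synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours, and it applies this to the training split only.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority neighbours.

    Simplified sketch: each synthetic point starts from a randomly chosen
    minority sample and moves a random fraction of the way towards one of
    its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base point, excluding itself
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def oversample_training_set(X_train, y_train, minority_label=1, k=5, seed=0):
    """Balance the training split only; the test split is never resampled."""
    minority = [x for x, y in zip(X_train, y_train) if y == minority_label]
    n_majority = len(y_train) - len(minority)
    n_new = max(0, n_majority - len(minority))
    new_points = smote(minority, n_new, k=k, seed=seed)
    return X_train + new_points, y_train + [minority_label] * n_new


# Toy data, made up for illustration: 8 non-defaults (0) and 3 defaults (1).
X_train = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.2, 0.4],
           [0.4, 0.2], [0.1, 0.4], [0.3, 0.1], [0.4, 0.4],
           [0.8, 0.9], [0.9, 0.8], [0.7, 0.7]]
y_train = [0] * 8 + [1] * 3

X_bal, y_bal = oversample_training_set(X_train, y_train, k=2)
print(y_bal.count(0), y_bal.count(1))  # prints: 8 8
```

Resampling only the training split, as the summary describes, keeps synthetic points out of the held-out data, so evaluation metrics reflect the original class distribution.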