Determining the credit worthiness of retail banking customer by machine learning technique

Bibliographic Details
Main Author: Peng, Yangling
Other Authors: Wong Kin Shun, Terence
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2024
Online Access: https://hdl.handle.net/10356/176996
Institution: Nanyang Technological University
Description
Summary: This research project compares statistical and machine learning models for credit risk assessment, focusing on their performance on the imbalanced datasets common in loan default prediction. Traditional statistical models and machine learning algorithms, including Decision Trees, Support Vector Machines (SVMs), and K-Nearest Neighbors (KNN), were evaluated to determine the most effective approach for predicting creditworthiness. The machine learning models outperformed their statistical counterparts, identifying complex, non-linear patterns and adapting to data imbalance. Resampling techniques, especially AdaptiveSMOTE, improved the predictive accuracy of distance-based models such as KNN and SVM by balancing the class distribution, which is crucial for minority-class prediction. The statistical models struggled with the skewed original dataset; applying SMOTE and AdaptiveSMOTE to balance class representation mitigated this. These resampling strategies proved critical in improving the reliability of the models' predictions. The study addresses an acute challenge in banking: the scarcity of default data. The improved analytical methods strengthen banks' capacity for precise credit risk management, reinforcing the financial sector's defenses against potential defaults.
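
The resampling idea mentioned in the summary can be illustrated with a minimal sketch of plain SMOTE-style oversampling. This is not the project's implementation, and it does not reproduce the AdaptiveSMOTE variant; the function name `smote`, its parameters, and the toy data are all hypothetical and chosen only for illustration. Each synthetic minority sample is placed at a random point on the line segment between an existing minority sample and one of its k nearest minority neighbours.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Hypothetical helper for illustration only; not taken from the project.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other points
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy, made-up data: 10 "non-default" points and 3 "default" points.
majority = [(float(i), float(i % 3)) for i in range(10)]
minority = [(1.0, 5.0), (2.0, 6.0), (3.0, 5.5)]

# Oversample the minority class until both classes have the same size.
new_points = smote(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
```

In practice, resampling of this kind would normally be applied to the training split only, so that synthetic points do not leak into the evaluation data.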