Investigating some classification models and applying in bankruptcy prediction

Recent years have seen much discussion of machine intelligence and what it means for human health, productivity and wellbeing. In this discussion, machine learning has demonstrated its increasingly important role with regard to fundamental human needs in the present and its power of prediction of the...

Bibliographic Details
Main Author: Tran, Thi Lan Phuong
Other Authors: Tran, Duc Quynh
Format: Final Year Project
Language: English
Published: 2020
Subjects: Machine learning; Random Forest; Bagging; Gradient Boosting; SMOTE
Online Access: http://repository.vnu.edu.vn/handle/VNU_123/95352
Institution: Vietnam National University, Hanoi
id oai:112.137.131.14:VNU_123-95352
record_format dspace
spelling oai:112.137.131.14:VNU_123-95352 2020-10-27T03:53:09Z Vietnam National University, Hanoi (ĐHQGHN) - International School (Khoa Quốc tế) Management of Information System 2020-10-27T03:50:11Z 2020-10-27T03:50:11Z 2020 Final Year Project (FYP) http://repository.vnu.edu.vn/handle/VNU_123/95352 TR-P en 46 p. application/pdf
institution Vietnam National University, Hanoi
building VNU Library & Information Center
continent Asia
country Vietnam
content_provider VNU Library and Information Center
collection VNU Digital Repository
language English
topic Machine learning
Random Forest
Bagging
Gradient Boosting
SMOTE
description Recent years have seen much discussion of machine intelligence and what it means for human health, productivity and wellbeing. In this discussion, machine learning has demonstrated an increasingly important role in meeting fundamental human needs in the present, as well as its power to predict future events. Meanwhile, bankruptcy has become a concern because of its negative effects on the economy and on wellbeing. The problem is out of control. Research on bankruptcy prediction using machine learning is therefore both necessary and practical at the moment. The purpose of this research is to study several classification models and then identify the best predictive model for the task of bankruptcy prediction. The models studied are decision tree, random forest, bagging and gradient boosting. The idea, architecture, operation and characteristics of each model are also explored. The Polish companies' bankruptcy dataset was chosen to support the project. The work begins by analyzing and assessing the quality of the dataset. Next, the dataset is preprocessed, using a random forest algorithm to impute missing values and the Synthetic Minority Oversampling Technique (SMOTE) to balance the two target labels in the dataset. The models are then applied to the processed dataset to find the best-performing model. Finally, K-fold cross-validation is applied to evaluate model performance. The project uses Python as the programming language, Spyder as a cross-platform integrated development environment, and Tableau and Microsoft Excel as visualization tools.
author2 Tran, Duc Quynh
author_facet Tran, Duc Quynh
Tran, Thi Lan Phuong
format Final Year Project
author Tran, Thi Lan Phuong
author_sort Tran, Thi Lan Phuong
title Investigating some classification models and applying in bankruptcy prediction
publishDate 2020
url http://repository.vnu.edu.vn/handle/VNU_123/95352
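
The description in this record names SMOTE balancing and K-fold cross-validation as steps in the project's workflow. The record does not include the project's code, so the following is only a minimal sketch of the general idea behind those two steps, written in plain Python with the standard library. The function names `smote` and `k_fold_indices`, the toy data, and the values of `k`, `n_folds` and `seed` are all hypothetical choices made for illustration; the random forest imputation and the model fitting mentioned in the description are not shown.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    minority: list of equal-length numeric feature lists (no missing values).
    Hypothetical helper for illustration, not a library API.
    """
    rng = random.Random(seed)
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base point (Euclidean distance)
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def k_fold_indices(n_samples, n_folds=5, seed=0):
    """Yield (train_idx, test_idx) pairs that partition range(n_samples)."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::n_folds] for f in range(n_folds)]
    for f in range(n_folds):
        test = folds[f]
        train = [i for g in range(n_folds) if g != f for i in folds[g]]
        yield train, test


# Toy data (made up): 2 features, label 1 = bankrupt (minority), 0 = not.
X = [[0.1, 1.0], [0.2, 1.1], [0.15, 0.9],
     [2.0, 3.0], [2.1, 3.2], [1.9, 2.8], [2.2, 3.1], [2.05, 2.9], [1.95, 3.3]]
y = [1, 1, 1, 0, 0, 0, 0, 0, 0]

# Oversample the minority class until both labels have the same count.
minority = [x for x, label in zip(X, y) if label == 1]
n_needed = y.count(0) - y.count(1)
X_bal = X + smote(minority, n_needed, k=2)
y_bal = y + [1] * n_needed

# Each split holds out one fold for testing and trains on the rest.
splits = list(k_fold_indices(len(X_bal), n_folds=3))
```

In this sketch each synthetic point lies on the line segment between a minority sample and one of its nearest minority neighbours, which is the core of SMOTE. A classifier such as a decision tree or random forest would then be trained on the `train` indices of each split and scored on the `test` indices, with the scores averaged across folds.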