SOCIAL MEDIA DATA TO DETERMINE LOAN DEFAULT PREDICTING METHOD IN AN ONLINE PEER TO PEER LENDING

Bibliographic Details
Main Author: Nabila Laila Khilfah, Hasna
Format: Final Project
Language: Indonesia
Online Access: https://digilib.itb.ac.id/gdl/view/38675
Institution: Institut Teknologi Bandung
Description
Summary: Financial technology is growing rapidly in Indonesia, and online peer-to-peer lending platforms are one of its major types. However, peer-to-peer lending is still exposed to credit risk, which remains its major concern. One online peer-to-peer lending company in Indonesia has faced fluctuations in its non-performing financing (bad debt) ratio. To address the issue, the company is trying to build a social media assessment into its credit scoring model. This research therefore aims to identify social media variables that could be used as predictors of default probability and to determine how much the predictability level changes when social media data are added to the model. Previous studies showed that social media or network data could improve repayment prediction by 18% (Tan & Phan, 2016), while Chen et al. (2016) showed that a combination of demographic, social media, and network features could outperform a model consisting only of demographic data by as much as 3.16%. These results indicate that social media data could increase the predictability level. This research uses a combination of credit scorecard and logistic regression as the loan default prediction method, since previous research (Tsai, 2009) showed that the method can perform well with fewer variables, which makes it more efficient to focus only on the significant variables in the credit scoring model. The independent variables consist of six social media variables (duration of Instagram usage in months, frequency of posting at midnight on Facebook, number of religious accounts followed on Instagram, Followers, Following, and number of Instagram posts per month) and seven control variables: two historical payment variables (tenor and installment amount) and five demographic variables (Gender, Marital Status, District, Employment, and Monthly Income).
The research finds that five variables can be used as predictors of default probability: Employment, Tenor, Posting Frequency at Midnight, Followers, and Following. Furthermore, the model with the selected variables, which combines demographic, historical payment, and social media data, increases the predictability level by as much as 8.9% compared with a model that uses only demographic and historical payment variables. It is therefore recommended that the company focus on these five variables and expand its social media assessment to develop its credit scoring model. The company should assign a low score to credit applicants who are employed with a low average salary, apply for a credit duration of more than six months, have fewer followers, follow more accounts, and frequently post on social media at midnight. Meanwhile, the regulator and supervisor (Bank Indonesia and Otoritas Jasa Keuangan) might consider social media data as a way to improve credit scoring applications. However, user privacy still has to be considered when implementing social media assessment. Future research should use more variables, especially social media variables such as the sentiment of statuses posted by the user, and should cover a time range longer than one year so that more than 100 observations are available, in order to achieve better model accuracy and predictability.
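
As a rough illustration of how a credit scorecard can be layered on top of logistic regression, the method named in the summary, the sketch below converts a predicted default probability into scorecard points using the common "points to double the odds" scaling. Everything specific in it is an assumption made for illustration and is not taken from the study: the feature names, the coefficient values (only their signs follow the directions stated in the summary), and the scaling constants (600 points at 50:1 good-to-bad odds, 20 points to double the odds).

```python
import math

# Hypothetical logistic regression coefficients for the five selected variables.
# The magnitudes are invented; only the signs mirror the summary (positive
# values raise the predicted default probability).
INTERCEPT = -3.0
COEFS = {
    "low_salary_employment": 0.60,      # 1 if employed with low average salary
    "tenor_over_6_months": 0.80,        # 1 if credit duration exceeds six months
    "midnight_posts_per_month": 0.05,   # posting frequency at midnight
    "followers_hundreds": -0.10,        # more followers, lower risk
    "following_hundreds": 0.08,         # more accounts followed, higher risk
}


def default_probability(applicant):
    """Logistic model: P(default) = 1 / (1 + exp(-z))."""
    z = INTERCEPT + sum(COEFS[name] * applicant[name] for name in COEFS)
    return 1.0 / (1.0 + math.exp(-z))


def scorecard_params(pdo=20.0, base_score=600.0, base_odds=50.0):
    """Return (factor, offset) so that score = offset + factor * ln(odds)."""
    factor = pdo / math.log(2)
    offset = base_score - factor * math.log(base_odds)
    return factor, offset


def score(p_default, factor, offset):
    """Map a default probability in (0, 1) to scorecard points."""
    odds_good = (1.0 - p_default) / p_default
    return offset + factor * math.log(odds_good)


factor, offset = scorecard_params()

# Two made-up applicants: A has the lower-risk profile, B the higher-risk one.
applicant_a = {
    "low_salary_employment": 0,
    "tenor_over_6_months": 0,
    "midnight_posts_per_month": 2,
    "followers_hundreds": 8,
    "following_hundreds": 3,
}
applicant_b = {
    "low_salary_employment": 1,
    "tenor_over_6_months": 1,
    "midnight_posts_per_month": 20,
    "followers_hundreds": 1,
    "following_hundreds": 10,
}

score_a = score(default_probability(applicant_a), factor, offset)
score_b = score(default_probability(applicant_b), factor, offset)
```

With these assumed values, applicant B receives the lower score, which matches the direction of the recommendation in the summary. In practice the coefficients would come from fitting the logistic regression to the lender's own loan records.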