IDENTIFICATION OF HEALTH CARE PROVIDER FRAUD USING SUPPORT VECTOR MACHINE

Fraud is one of the most serious problems in the insurance industry and causes huge losses. In health insurance, deliberately hiding or omitting facts when submitting a claim is a fraudulent act that causes large losses for insurance companies. Fraudulent act...

Bibliographic Details
Main Author:	Jeremy
Format:	Final Project
Language:	Indonesia
Online Access:	https://digilib.itb.ac.id/gdl/view/81303
Institution:	Institut Teknologi Bandung

id	id-itb.:81303
timestamp	2024-06-12T11:21:21Z
keywords	Fraud Detection, Support Vector Machine, Hyperparameter Optimization, Kernels
institution	Institut Teknologi Bandung
building	Institut Teknologi Bandung Library
continent	Asia
country	Indonesia
content_provider	Institut Teknologi Bandung
collection	Digital ITB
language	Indonesia
description	Fraud is one of the most serious problems in the insurance industry and causes huge losses. In health insurance, deliberately hiding or omitting facts when submitting a claim is a fraudulent act that causes large losses for insurance companies. Fraud schemes are increasingly diverse and data volumes keep growing, which makes fraud hard to recognize in large data sets. One way to address this is to detect fraud with machine learning. This research compares the performance of a linear support vector machine (SVM) with nonlinear SVMs that use radial basis function (RBF) and sigmoid kernels. Building an SVM model requires setting several hyperparameters; to find good values, three hyperparameter optimization methods are used: grid search, random search and Bayesian optimization. Data preparation involves normalization, oversampling and feature selection so that the resulting model performs better. Normalization uses a robust scaler and oversampling uses SMOTE. Feature selection, which reduces dimensionality by removing irrelevant and redundant information to obtain an optimal feature subset, uses recursive feature elimination (RFE). The best model obtained was the linear SVM with 20 features selected by RFE and hyperparameters tuned by random search. This model achieves an AUC of 0.93732 on the test data, which indicates that it classifies very well.
format	Final Project
author	Jeremy
title	IDENTIFICATION OF HEALTH CARE PROVIDER FRAUD USING SUPPORT VECTOR MACHINE
url	https://digilib.itb.ac.id/gdl/view/81303
_version_	1822997248620888064
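
The two data preparation steps named in the description, robust scaling and SMOTE oversampling, can be illustrated with a minimal sketch. The code below is not taken from the thesis: it is a standard-library Python illustration, and the function names `robust_scale` and `smote`, the neighbour count `k`, and the toy values are all assumptions made for this example.

```python
import math
import random
import statistics


def robust_scale(column):
    """Center a feature on its median and divide by its interquartile range."""
    q1, median, q3 = statistics.quantiles(column, n=4, method="inclusive")
    iqr = (q3 - q1) or 1.0  # fall back to 1.0 for constant features
    return [(x - median) / iqr for x in column]


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority rows by interpolating towards neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority rows to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen row (excluding the row itself)
        others = [row for row in minority if row is not base]
        neighbours = sorted(others, key=lambda row: math.dist(base, row))[:k]
        target = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to target
        synthetic.append([b + gap * (t - b) for b, t in zip(base, target)])
    return synthetic


if __name__ == "__main__":
    amounts = [1.0, 2.0, 3.0, 4.0, 5.0]  # made-up feature values
    print(robust_scale(amounts))
    fraud_rows = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]  # made-up minority-class rows
    print(smote(fraud_rows, n_new=2, k=2))
```

Scaling by median and interquartile range keeps a few extreme claim amounts from dominating the feature, and the interpolation step produces new minority rows that lie on segments between existing ones rather than exact copies. In practice, oversampling of this kind is normally applied to the training split only.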

Similar Items