A PREDICTION OF A HEALTH INSURANCE PREMIUM USING A GENERALIZED LINEAR MODEL (GLM) AND GRADIENT BOOSTING MACHINE (GBM)

Insurance is a business that transfers risk from an insured (the policyholder) to an insurer (the insurance company). In compensation for this transfer of risk, the policyholder pays an insurance premium. However, there is a level of uncertainty in the amount of premium...

Bibliographic Details
Main Author: Satria Joel Manurung, Tito
Format: Final Project
Language: Indonesia
Online Access: https://digilib.itb.ac.id/gdl/view/68974
Institution: Institut Teknologi Bandung
id id-itb.:68974
timestamp 2022-09-19T19:17:07Z
keywords insurance premium, Generalized Linear Model (GLM), Gradient Boosting Machine (GBM)
institution Institut Teknologi Bandung
building Institut Teknologi Bandung Library
continent Asia
country Indonesia
content_provider Institut Teknologi Bandung
collection Digital ITB
language Indonesia
description Insurance is a business that transfers risk from an insured (the policyholder) to an insurer (the insurance company). In compensation for this transfer of risk, the policyholder pays an insurance premium. The amount of premium that should be paid is uncertain, however, because the frequency and severity of claims are not known at the time the premium is due. In this final project, two methodologies are used to determine the premium a policyholder must pay for a general insurance product. The first is a regression model, the Generalized Linear Model (GLM), which assumes that the response variable follows a distribution in the exponential family. The second is the Gradient Boosting Machine (GBM), which requires no assumption about the probability distribution of the response variable. The data are a United States health insurance dataset obtained from Kaggle.com. The premium variable, which is the response variable, follows a Tweedie distribution; based on that probability model, the natural logarithm link function is used in the GLM. For the GBM, four hyperparameters are considered: shrinkage, interaction.depth, minobsinnode, and n.trees. The two methodologies are compared by their RMSE on the test set. The GBM produced a smaller RMSE than the GLM, which means that, for the data analyzed, the GBM predicts insurance premiums better than the GLM. It should be noted, however, that GLM results are more interpretable than GBM results, and hence the GLM is still widely used in modeling general insurance data.
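The two computational ideas named in the description, prediction through a natural logarithm link and comparison by test-set RMSE, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the coefficient layout (intercept first), and the toy numbers are hypothetical and are not taken from the final project or from the Kaggle dataset.

```python
import math


def glm_log_link_predict(beta, x):
    """Mean prediction under a log link: mu = exp(beta[0] + sum_j beta[j+1] * x[j]).

    `beta` holds a hypothetical intercept followed by one coefficient per feature.
    """
    eta = beta[0] + sum(b * v for b, v in zip(beta[1:], x))
    return math.exp(eta)


def rmse(actual, predicted):
    """Root mean squared error between observed and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))


# Hypothetical toy test set; these are NOT values from the project's data.
test_premiums = [100.0, 200.0, 300.0]
preds_model_a = [110.0, 190.0, 300.0]  # stand-in for one model's predictions
preds_model_b = [130.0, 170.0, 330.0]  # stand-in for the other model's predictions

rmse_a = rmse(test_premiums, preds_model_a)
rmse_b = rmse(test_premiums, preds_model_b)

# The model with the smaller test-set RMSE is preferred under this criterion.
preferred = "A" if rmse_a < rmse_b else "B"
print(f"RMSE A = {rmse_a:.3f}, RMSE B = {rmse_b:.3f}, preferred: {preferred}")
```

Because the log link exponentiates the linear predictor, the predicted premium is always positive, which suits a non-negative response such as a premium.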
format Final Project
author Satria Joel Manurung, Tito
title A PREDICTION OF A HEALTH INSURANCE PREMIUM USING A GENERALIZED LINEAR MODEL (GLM) AND GRADIENT BOOSTING MACHINE (GBM)