ON PERFORMANCE COMPARISON BETWEEN STRONG MACHINE UNLEARNING ALGORITHMS FOR LOGISTIC-BASED CREDIT ASSESSMENT MODELS

Bibliographic Details
Main Author: S.O.N. Simbolon, Jeremy
Format: Final Project
Language: Indonesian
Online Access: https://digilib.itb.ac.id/gdl/view/84982
Institution: Institut Teknologi Bandung
Description
Summary: The enactment of UU 27/2022 on Personal Data Protection requires financial service institutions, as personal data processors, to erase debtors’ personal data upon request, which is difficult to do for trained machine learning models. To address this issue, machine unlearning methods have been developed to erase the influence of training data on model weights. In this research, the performance of two strong machine unlearning algorithm implementations, ε-δ Certified Removal (CR) and Projective Residual Update (PRU), is compared on logistic-based models developed within the context of credit risk assessment. The credit risk assessment models were developed using a debtor data set provided by Indonesian financial service institutions, and the best credit models were selected according to hyperparameter tuning results. The machine unlearning algorithms used in the experiment were implemented with L2-regularizer λ ∈ {0.01, 0.005, 0.001} to erase the influence of k = 10% of the training data. The experimental results showed that ε-δ CR yielded a model with lower L2 distance, higher accuracy, and shorter unlearning time than PRU for k < 3%, while the opposite was true for k ≥ 3%. Further research is required to explore the effects of larger training data sets with greater dimensionality on the performance of both algorithms.