BIAS HANDLING IN-PROCESSING ALGORITHMS COMPARISON IN MACHINE LEARNING

Bibliographic Details
Main Author: Ellen
Format: Final Project
Language: Indonesian
Online Access: https://digilib.itb.ac.id/gdl/view/49516
Institution: Institut Teknologi Bandung
Description
Summary: Algorithmic bias is a form of bias that occurs when mathematical rules favor one set of attributes over others in relation to some target variable, such as “approving” or “denying” a loan (Bantilan, 2018). It surfaces when a trained machine learning model produces systematic predictions that favor a group of attributes with respect to a target variable. In this work, we conducted an experiment on handling bias with in-processing algorithms. We used four algorithms: adversarial debiasing, prejudice remover, additive counterfactually fair, and decision boundary fairness measurements. We tested them on the COMPAS, Adult Income, German Credit Risk, and Bank Marketing datasets, then analyzed and compared the results of models with bias handling against models without it. We also built a Python 3 library containing implementations of the four algorithms, to make bias handling with in-processing algorithms easier to use. We found that these algorithms can reduce bias when their hyperparameters are configured appropriately. However, the performance of each algorithm varied across datasets. Of the four, the decision boundary measurements algorithm produced the most significant results in accuracy, F1 score, and bias metrics, while the prejudice remover algorithm produced the least significant results for all three.
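To make the comparison between models with and without bias handling concrete, here is a minimal sketch of how one common bias metric could be computed. The summary does not say which bias metrics were used, so statistical parity difference is an assumption chosen for illustration. The function names and the toy predictions are hypothetical and are not results from this work.

```python
from typing import Sequence


def positive_rate(predictions: Sequence[int], groups: Sequence[int], group: int) -> float:
    """Share of favorable predictions (1) among the members of one group."""
    selected = [p for p, g in zip(predictions, groups) if g == group]
    if not selected:
        raise ValueError(f"no samples for group {group}")
    return sum(selected) / len(selected)


def statistical_parity_difference(
    predictions: Sequence[int],
    groups: Sequence[int],
    unprivileged: int = 0,
    privileged: int = 1,
) -> float:
    """Favorable-prediction rate of the unprivileged group minus that of
    the privileged group. A value of 0.0 means both groups receive the
    favorable outcome at the same rate."""
    return (
        positive_rate(predictions, groups, unprivileged)
        - positive_rate(predictions, groups, privileged)
    )


# Invented toy data: 1 = loan approved, 0 = loan denied.
# groups holds the protected attribute (0 = unprivileged, 1 = privileged).
groups = [0, 0, 0, 0, 1, 1, 1, 1]
baseline_predictions = [1, 0, 0, 0, 1, 1, 1, 0]  # hypothetical model without bias handling
debiased_predictions = [1, 1, 0, 0, 1, 1, 0, 0]  # hypothetical model with bias handling

baseline_spd = statistical_parity_difference(baseline_predictions, groups)
debiased_spd = statistical_parity_difference(debiased_predictions, groups)
print(f"baseline SPD: {baseline_spd:+.2f}")
print(f"debiased SPD: {debiased_spd:+.2f}")
```

A value closer to zero indicates that the two groups are approved at more similar rates. In a real comparison the metric would be reported alongside accuracy and F1 score, since reducing bias can change predictive performance.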