COMPARING BIAS MITIGATION POST-PROCESSING METHODS IN A DISCRIMINATION-AWARE CLASSIFIER BUILD PROCESS


Bibliographic Details
Main Author: Leslie, Louis
Format: Final Project
Language: Indonesian
Online Access:https://digilib.itb.ac.id/gdl/view/49888
Institution: Institut Teknologi Bandung
Description
Summary: A machine learning process is performed using training data collected from the real world. Such data may contain bias introduced by the humans who collect it. When the data contain individual or personal information about people, this bias can be discriminatory, and discrimination against certain minority groups can occur when the biased data are used in decision making. A machine learning process can likewise produce a discriminatory classifier model when it is trained on discriminatory or biased data. To prevent the production of such classifiers, bias mitigation methods are applied to build discrimination-aware classifiers. Several bias mitigation methods, particularly post-processing methods, are discussed in this paper: equalized odds post-processing (Hardt et al., 2016), calibrated equalized odds post-processing (Pleiss et al., 2017), and reject option classification (Kamiran et al., 2012). The three post-processing methods are tested on classifiers built using the Adult Dataset, the COMPAS Recidivism Dataset, the German Dataset, and the MEPS Dataset, and the evaluation results of each method are compared and analyzed. The results show that equalized odds post-processing reduced group bias in the classifiers most significantly compared to the other post-processing methods. However, when the decrease in classifier performance is also considered, reject option classification is the most robust method. This research can be continued by running the experiment on more datasets with different sensitive attributes.
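To illustrate the kind of post-processing the abstract describes, the following is a minimal NumPy sketch of the reject option classification idea (Kamiran et al., 2012), not the thesis's implementation: predictions whose posterior probability falls inside an ambiguity region around the decision boundary are reassigned so that the unprivileged group receives the favorable label. The function name, the `theta` parameter, and the toy inputs are illustrative assumptions.

```python
import numpy as np

def reject_option_classify(scores, sensitive, theta=0.7, privileged=1):
    """Sketch of reject option classification (Kamiran et al., 2012).

    scores    -- predicted probability of the favorable label (class 1)
    sensitive -- group membership per instance (privileged vs. not)
    theta     -- boundary of the critical region in (0.5, 1]; instances
                 whose max posterior is below theta count as ambiguous
    """
    scores = np.asarray(scores, dtype=float)
    sensitive = np.asarray(sensitive)
    # Default decision: the standard 0.5 threshold.
    labels = (scores >= 0.5).astype(int)
    # Critical region: the classifier is unsure about these instances.
    ambiguous = np.maximum(scores, 1.0 - scores) < theta
    # Inside the region, give the favorable label to the unprivileged
    # group and the unfavorable label to the privileged group.
    labels[ambiguous & (sensitive != privileged)] = 1
    labels[ambiguous & (sensitive == privileged)] = 0
    return labels

# Toy example: the two ambiguous scores (0.55, 0.45) are flipped by group.
scores = np.array([0.9, 0.55, 0.45, 0.1])
groups = np.array([1, 1, 0, 0])  # 1 = privileged group
print(reject_option_classify(scores, groups, theta=0.7))  # → [1 0 1 0]
```

Only the ambiguous middle instances change: the privileged instance at 0.55 is demoted to 0 and the unprivileged instance at 0.45 is promoted to 1, while confident predictions are left untouched, which is why the abstract finds this method costs the least classifier performance.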