PREPROCESSING TECHNIQUES FOR HANDLING DISCRIMINATION IN BINARY LABEL DATASETS

Data mining is the process of extracting patterns and knowledge from data (Han et al., 2012). Its results can inform users' next business decisions. However, the predictions are not fully reliable. One reason is the possibility of unfair...

Bibliographic Details
Main Author:	Zabrina Pramata, Nella
Format:	Final Project
Language:	Indonesia
Online Access:	https://digilib.itb.ac.id/gdl/view/51429
Institution:	Institut Teknologi Bandung

id	id-itb.:51429
spelling	id-itb.:51429 | 2020-09-28T17:51:34Z | PREPROCESSING TECHNIQUES FOR HANDLING DISCRIMINATION IN BINARY LABEL DATASETS | Zabrina Pramata, Nella | Indonesia | Final Project | Fairness AI preprocessing techniques, discrimination, Binary Label Dataset | INSTITUT TEKNOLOGI BANDUNG | https://digilib.itb.ac.id/gdl/view/51429 | text
institution	Institut Teknologi Bandung
building	Institut Teknologi Bandung Library
continent	Asia
country	Indonesia
content_provider	Institut Teknologi Bandung
collection	Digital ITB
language	Indonesia
description	Data mining is the process of extracting patterns and knowledge from data (Han et al., 2012). Its results can inform users' next business decisions. However, the predictions are not fully reliable. One reason is that a model's predictions may be unfair. Unfairness can arise when the training data contains sensitive information: the patterns learned during training are influenced by that information, which can lead to discrimination on the basis of the sensitive attributes and harm specific groups of people. An Artificial Intelligence system that could potentially discriminate on the basis of sensitive information is called Unfair Artificial Intelligence, henceforth abbreviated as Unfair AI. Researchers have proposed techniques for handling Unfair AI at the preprocessing, inprocessing, and postprocessing stages. This research focuses on Fairness AI preprocessing techniques, so that sensitive attributes in the dataset are handled before the training step, and considers only Binary Label Datasets. The nine Fairness AI preprocessing techniques used are Uniform Sampling, Preferential Sampling, Massaging the Dataset, Reweighing, Suppression, and four modified Suppression techniques. The results show that Uniform Sampling, Massaging the Dataset, and Reweighing tend to reduce the level of discrimination. The other six techniques can also reduce the level of discrimination, although not always effectively. No single technique is the most suitable for every dataset, so all nine still need to be tried to find out which one suits a given dataset.
format	Final Project
author	Zabrina Pramata, Nella
title	PREPROCESSING TECHNIQUES FOR HANDLING DISCRIMINATION IN BINARY LABEL DATASETS
url	https://digilib.itb.ac.id/gdl/view/51429
_version_	1822928737518223360

Similar Items