IMPLEMENTATION ANALYSIS OF IN-PROCESSING ALGORITHMS THAT MEET CRITERIA OF MONOTONIC SELECTIVE RISK ON FAIRNESS AND ACCURACY IN MACHINE LEARNING

Bibliographic Details
Main Author: Pradipta, Nayotama
Format: Final Project
Language: Indonesian
Online Access:https://digilib.itb.ac.id/gdl/view/86174
Institution: Institut Teknologi Bandung
Description
Summary: The tradeoff between fairness and accuracy is a common issue in machine learning model development, particularly when the training data contains biases. Models that prioritize accuracy often achieve optimal predictive results but frequently sacrifice fairness. The primary challenge in this context is finding a way to build models that are not only accurate but also fair, especially in real-world applications such as recruitment systems. This work explores several in-processing algorithms designed to balance the tradeoff between fairness and accuracy. These algorithms focus on selective regression models, which use the variance of predictions as a measure of the model's confidence. The evaluated methods include ensemble selective regression, fairness under unawareness, and heteroskedastic neural networks with a sufficiency-based regularizer. Each model is assessed on metrics including monotonic selective risk, accuracy, and fairness. The findings indicate that the heteroskedastic model with the sufficiency-based regularizer delivers excellent performance in both fairness and accuracy: it reduces RMSE while maintaining ideal levels of independence, separation, and sufficiency. Future research could expand on this work by testing on larger recruitment datasets with more diverse sensitive attributes.
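
The sketch below (not taken from the thesis itself) illustrates the selective-regression idea the abstract describes, under the assumption that a heteroskedastic network predicts a mean and a variance per input and that predictions are accepted only for the lowest-variance fraction of points. Selective risk is then the RMSE on the accepted subset; evaluating it across coverage levels shows whether risk decreases monotonically as coverage shrinks. The sufficiency-based fairness regularizer is omitted here, since it would additionally require the sensitive attribute.

import torch
import torch.nn as nn

class HeteroskedasticMLP(nn.Module):
    """Small network with separate heads for the predictive mean and variance."""
    def __init__(self, in_dim, hidden=64):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.mean_head = nn.Linear(hidden, 1)
        self.logvar_head = nn.Linear(hidden, 1)

    def forward(self, x):
        h = self.body(x)
        # exp() keeps the predicted variance strictly positive
        return self.mean_head(h), self.logvar_head(h).exp()

def selective_rmse(mean, var, y, coverage):
    """RMSE on the `coverage` fraction of points with the lowest predicted variance."""
    n_keep = max(1, int(coverage * len(y)))
    keep = torch.argsort(var.squeeze(-1))[:n_keep]
    return torch.sqrt(((mean[keep] - y[keep]) ** 2).mean())

# Toy synthetic data and training loop, for illustration only.
torch.manual_seed(0)
X = torch.randn(512, 5)
y = X.sum(dim=1, keepdim=True) + 0.5 * torch.randn(512, 1)

model = HeteroskedasticMLP(in_dim=5)
loss_fn = nn.GaussianNLLLoss()   # heteroskedastic Gaussian negative log-likelihood
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for _ in range(200):
    mean, var = model(X)
    loss = loss_fn(mean, y, var)
    opt.zero_grad(); loss.backward(); opt.step()

with torch.no_grad():
    mean, var = model(X)
    for c in (1.0, 0.8, 0.6, 0.4):
        print(f"coverage={c:.1f}  selective RMSE={selective_rmse(mean, var, y, c):.3f}")

In this setup, a monotonic selective-risk property would show up as the printed RMSE not increasing as coverage decreases; the thesis's sufficiency-based regularizer would be an additional loss term on the predictions grouped by the sensitive attribute, which this sketch does not include.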