Bias problems in large language models and how to mitigate them
Pretrained Language Models (PLMs) such as ChatGPT have become integral to many industries, revolutionising applications from customer service to software development. However, these PLMs are often trained on vast, unmoderated datasets that may contain social biases, which can then be propagated in the models' outputs. This study evaluates the effectiveness of five debiasing techniques, namely Self-Debias, Counterfactual Data Augmentation (CDA), SentenceDebias, Iterative Nullspace Projection (INLP), and Dropout regularization, on three autoregressive large language models: GPT-2, Phi-2, and Llama-2. It focuses on three bias categories (gender, race, and religion) in both U.S. and Singapore contexts, using the established bias benchmarking datasets CrowS-Pairs and StereoSet. The study finds that Self-Debias is the most effective mitigation strategy, consistently reducing bias across all tested scenarios, although it can incur significant trade-offs in downstream-task performance. Bias mitigation is also more effective on the U.S. datasets than on the Singapore datasets, primarily because Singapore-context text is scarce in the models' training data. The study highlights the complexity of bias mitigation, the need to carefully balance bias reduction against model performance, and the importance of curating context-specific datasets, and it concludes with practical recommendations for future research and industry applications.
Saved in: DR-NTU (Nanyang Technological University Library)
Main Author: Ong, Adrian Zhi Ying
Other Authors: Luu Anh Tuan (College of Computing and Data Science)
Format: Final Year Project (FYP)
Language: English
Published: Nanyang Technological University, 2024
Subjects: Computer and Information Science; Bias; Large language model
Online Access: https://hdl.handle.net/10356/181163
Institution: Nanyang Technological University
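The benchmarks named in the abstract (CrowS-Pairs and StereoSet) score bias by comparing how a model rates minimally different stereotypical and anti-stereotypical sentences. As a rough illustration only, and not code from the project itself, the sketch below shows one way such a comparison can be run for an autoregressive model like GPT-2 using the Hugging Face transformers library; the sentence pair is invented for demonstration.

```python
# Illustrative sketch (assumption: a CrowS-Pairs-style likelihood comparison,
# not the project's actual evaluation code).
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def sentence_log_likelihood(text: str) -> float:
    """Total log-probability of the sentence's tokens (after the first) under the model."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, labels=ids)
    # out.loss is the mean negative log-likelihood over the predicted tokens,
    # so multiply by the number of predicted positions and negate.
    return -out.loss.item() * (ids.size(1) - 1)

# Invented example pair differing only in the stereotyped attribute.
stereotypical = "The nurse said she would be late."
anti_stereotypical = "The nurse said he would be late."

prefers_stereotype = sentence_log_likelihood(stereotypical) > sentence_log_likelihood(anti_stereotypical)
print("Model prefers the stereotypical sentence:", prefers_stereotype)
```

Aggregated over a full benchmark, the fraction of pairs for which the model prefers the stereotypical sentence gives a bias score (roughly 50% indicating no systematic preference), which is the kind of measurement a debiasing technique such as Self-Debias is meant to move toward parity.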