Exploring and repairing gender fairness violations in word embedding-based sentiment analysis model through adversarial patches

With the advancement of sentiment analysis (SA) models and their incorporation into our daily lives, fairness testing of these models is crucial, since unfair decisions can discriminate against a large population. Nevertheless, fairness testing faces several challenges: the unknown test oracle, the difficulty of generating suitable test inputs, and the lack of a reliable way to fix the issues uncovered. To fill these gaps, BiasRV, a tool based on metamorphic testing (MT), was introduced and succeeded in uncovering fairness issues in a transformer-based model. However, the extent of unfairness in other SA models has not been thoroughly investigated. Our work conducts a more comprehensive empirical study to reveal the extent of fairness violations, specifically gender fairness violations, exhibited by other popular word embedding-based SA models. We define a fairness violation as the behavior in which an SA model predicts different sentiments for variants of a text that differ only in gender. Our inspection using BiasRV uncovers at least 30 fairness violations (at BiasRV's default threshold) in all three SA models. Recognizing the importance of addressing such significant violations, we introduce adversarial patches (AP) as a means of patch generation in an automated program repair (APR) system to fix them. We adopt adversarial fine-tuning in AP by retraining SA models on adversarial examples, which are bias-uncovering test cases dynamically generated at runtime by a tool named BiasFinder. Evaluation of the SA models shows that our proposed AP reduces fairness violations by at least 25%.
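The metamorphic relation the abstract describes — gender-swapped variants of a text must receive the same predicted sentiment — and the BiasFinder-style mutants used for adversarial fine-tuning can be sketched as follows. This is a minimal illustration under stated assumptions: the swap table, the whitespace tokenizer, and all function names (`swap_gender`, `violates_fairness`, `make_adversarial_examples`) are hypothetical simplifications, not the actual BiasRV or BiasFinder implementations.

```python
# Illustrative word-level swap table. Real mutation tools handle far more
# terms, plus case, morphology, and the his/her/him ambiguity properly;
# here "her" is mapped to "him" only, as a simplification.
GENDER_SWAPS = {
    "he": "she", "she": "he",
    "him": "her", "her": "him", "his": "her",
    "man": "woman", "woman": "man",
}

def swap_gender(text: str) -> str:
    """Create a variant of `text` that differs only in gender terms."""
    return " ".join(GENDER_SWAPS.get(tok.lower(), tok) for tok in text.split())

def violates_fairness(predict, text: str) -> bool:
    """Metamorphic relation: a gender-swapped variant must get the same sentiment."""
    return predict(text) != predict(swap_gender(text))

def make_adversarial_examples(texts, labels):
    """Adversarial fine-tuning data in the BiasFinder spirit: gender-swapped
    mutants keep the original sentiment label, so retraining on them pushes
    the model toward gender-invariant predictions."""
    return [(swap_gender(t), y)
            for t, y in zip(texts, labels)
            if swap_gender(t) != t]
```

For example, a toy classifier that returns "negative" whenever "she" appears would be flagged by `violates_fairness` on "he liked it", since its swapped variant "she liked it" gets a different sentiment; the flagged mutant, paired with the original label, becomes a fine-tuning example.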

Bibliographic Details
Main Authors: KHOO, Lin Sze, BAY, Jia Qi, YAP, Ming Lee Kimberly, LIM, Mei Kuan, CHONG, Chun Yong, YANG, Zhou, LO, David
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2023
Subjects: Fairness testing; Automated program repair; Sentiment analysis; Software Engineering
Online Access:https://ink.library.smu.edu.sg/sis_research/8514
https://ink.library.smu.edu.sg/context/sis_research/article/9517/viewcontent/Exploring_and_Repairing_Gender_Fairness_Violations_in_Word_Embedding_based_Sentiment_Analysis_Model_through_Adversarial_Patches.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-9517
record_format dspace
spelling sg-smu-ink.sis_research-9517 2024-01-22T15:09:18Z Exploring and repairing gender fairness violations in word embedding-based sentiment analysis model through adversarial patches
2023-03-01T08:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/8514 info:doi/10.1109/SANER56733.2023.00066 https://ink.library.smu.edu.sg/context/sis_research/article/9517/viewcontent/Exploring_and_Repairing_Gender_Fairness_Violations_in_Word_Embedding_based_Sentiment_Analysis_Model_through_Adversarial_Patches.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Fairness testing Automated program repair Sentiment analysis Software Engineering
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Fairness testing
Automated program repair
Sentiment analysis
Software Engineering
description With the advancement of sentiment analysis (SA) models and their incorporation into our daily lives, fairness testing of these models is crucial, since unfair decisions can discriminate against a large population. Nevertheless, fairness testing faces several challenges: the unknown test oracle, the difficulty of generating suitable test inputs, and the lack of a reliable way to fix the issues uncovered. To fill these gaps, BiasRV, a tool based on metamorphic testing (MT), was introduced and succeeded in uncovering fairness issues in a transformer-based model. However, the extent of unfairness in other SA models has not been thoroughly investigated. Our work conducts a more comprehensive empirical study to reveal the extent of fairness violations, specifically gender fairness violations, exhibited by other popular word embedding-based SA models. We define a fairness violation as the behavior in which an SA model predicts different sentiments for variants of a text that differ only in gender. Our inspection using BiasRV uncovers at least 30 fairness violations (at BiasRV's default threshold) in all three SA models. Recognizing the importance of addressing such significant violations, we introduce adversarial patches (AP) as a means of patch generation in an automated program repair (APR) system to fix them. We adopt adversarial fine-tuning in AP by retraining SA models on adversarial examples, which are bias-uncovering test cases dynamically generated at runtime by a tool named BiasFinder. Evaluation of the SA models shows that our proposed AP reduces fairness violations by at least 25%.
format text
author KHOO, Lin Sze
BAY, Jia Qi
YAP, Ming Lee Kimberly
LIM, Mei Kuan
CHONG, Chun Yong
YANG, Zhou
LO, David
author_sort KHOO, Lin Sze
title Exploring and repairing gender fairness violations in word embedding-based sentiment analysis model through adversarial patches
publisher Institutional Knowledge at Singapore Management University
publishDate 2023
url https://ink.library.smu.edu.sg/sis_research/8514
https://ink.library.smu.edu.sg/context/sis_research/article/9517/viewcontent/Exploring_and_Repairing_Gender_Fairness_Violations_in_Word_Embedding_based_Sentiment_Analysis_Model_through_Adversarial_Patches.pdf
_version_ 1789483256822890496