She elicits requirements and he tests: Software engineering gender bias in large language models
Implicit gender bias in software development is a well-documented issue, such as the association of technical roles with men. To address this bias, it is important to understand it in more detail. This study uses data mining techniques to investigate the extent to which 56 tasks related to software...
Main Authors: | TREUDE, Christoph; HATA, Hideaki |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University, 2023 |
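Subjects: | gender bias; large language models; software engineering; Programming Languages and Compilers; Software Engineering |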
Online Access: | https://ink.library.smu.edu.sg/sis_research/8865 https://ink.library.smu.edu.sg/context/sis_research/article/9868/viewcontent/genderbias.pdf |
Institution: | Singapore Management University |
id |
sg-smu-ink.sis_research-9868 |
---|---|
record_format |
dspace |
spelling |
sg-smu-ink.sis_research-9868; 2024-06-13T09:10:36Z; She elicits requirements and he tests: Software engineering gender bias in large language models; TREUDE, Christoph; HATA, Hideaki; Implicit gender bias in software development is a well-documented issue, such as the association of technical roles with men. To address this bias, it is important to understand it in more detail. This study uses data mining techniques to investigate the extent to which 56 tasks related to software development, such as assigning GitHub issues and testing, are affected by implicit gender bias embedded in large language models. We systematically translated each task from English into a genderless language and back, and investigated the pronouns associated with each task. Based on translating each task 100 times in different permutations, we identify a significant disparity in the gendered pronoun associations with different tasks. Specifically, requirements elicitation was associated with the pronoun “he” in only 6% of cases, while testing was associated with “he” in 100% of cases. Additionally, tasks related to helping others had a 91% association with “he” while the same association for tasks related to asking coworkers was only 52%. These findings reveal a clear pattern of gender bias related to software development tasks and have important implications for addressing this issue both in the training of large language models and in broader society.; 2023-05-01T07:00:00Z; text; application/pdf; https://ink.library.smu.edu.sg/sis_research/8865; info:doi/10.1109/MSR59073.2023.00088; https://ink.library.smu.edu.sg/context/sis_research/article/9868/viewcontent/genderbias.pdf; http://creativecommons.org/licenses/by-nc-nd/4.0/; Research Collection School Of Computing and Information Systems; eng; Institutional Knowledge at Singapore Management University; gender bias; large language models; software engineering; Programming Languages and Compilers; Software Engineering |
institution |
Singapore Management University |
building |
SMU Libraries |
continent |
Asia |
country |
Singapore |
content_provider |
SMU Libraries |
collection |
InK@SMU |
language |
English |
topic |
gender bias; large language models; software engineering; Programming Languages and Compilers; Software Engineering |
description |
Implicit gender bias in software development is a well-documented issue, such as the association of technical roles with men. To address this bias, it is important to understand it in more detail. This study uses data mining techniques to investigate the extent to which 56 tasks related to software development, such as assigning GitHub issues and testing, are affected by implicit gender bias embedded in large language models. We systematically translated each task from English into a genderless language and back, and investigated the pronouns associated with each task. Based on translating each task 100 times in different permutations, we identify a significant disparity in the gendered pronoun associations with different tasks. Specifically, requirements elicitation was associated with the pronoun “he” in only 6% of cases, while testing was associated with “he” in 100% of cases. Additionally, tasks related to helping others had a 91% association with “he” while the same association for tasks related to asking coworkers was only 52%. These findings reveal a clear pattern of gender bias related to software development tasks and have important implications for addressing this issue both in the training of large language models and in broader society. |
format |
text |
author |
TREUDE, Christoph; HATA, Hideaki |
author_sort |
TREUDE, Christoph |
title |
She elicits requirements and he tests: Software engineering gender bias in large language models |
publisher |
Institutional Knowledge at Singapore Management University |
publishDate |
2023 |
url |
https://ink.library.smu.edu.sg/sis_research/8865 https://ink.library.smu.edu.sg/context/sis_research/article/9868/viewcontent/genderbias.pdf |
_version_ |
1814047600968466432 |
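
The description above reports, for each task, the share of back-translated sentences that used the pronoun "he". A minimal sketch of how such a share could be computed from a list of back-translated sentences follows. It is an illustration only: the record does not state which translation service, target language, or counting rule the authors used, so the function names (`first_pronoun`, `pronoun_shares`), the rule of taking the first "he"/"she" in each sentence, and the sample sentences are all assumptions, and no translation step is performed here.

```python
import re
from collections import Counter
from typing import Dict, Iterable, Optional

# Matches the standalone words "he" or "she" (not the "he" inside "the").
PRONOUN_RE = re.compile(r"\b(he|she)\b", re.IGNORECASE)


def first_pronoun(sentence: str) -> Optional[str]:
    """Return the first 'he' or 'she' in the sentence, lowercased, or None."""
    match = PRONOUN_RE.search(sentence)
    return match.group(1).lower() if match else None


def pronoun_shares(back_translations: Iterable[str]) -> Dict[str, float]:
    """Share of 'he' and 'she' among sentences that contain either pronoun.

    Sentences with neither pronoun are ignored (an assumed rule, not
    necessarily the one used in the study).
    """
    counts = Counter()
    for sentence in back_translations:
        pronoun = first_pronoun(sentence)
        if pronoun is not None:
            counts[pronoun] += 1
    total = counts["he"] + counts["she"]
    if total == 0:
        return {"he": 0.0, "she": 0.0}
    return {"he": counts["he"] / total, "she": counts["she"] / total}


if __name__ == "__main__":
    # Hypothetical back-translated sentences, invented for illustration.
    samples = [
        "He tests the software.",
        "She tests the software.",
        "He runs the tests.",
        "They test it.",
    ]
    print(pronoun_shares(samples))
```

In a setup like the one described, the input list would hold the 100 back-translations of a single task, and the function would be applied once per task to compare shares across the 56 tasks.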