Evaluating Gender Bias in Pre-trained Filipino FastText Embeddings

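Past studies show that word embeddings can learn gender biases introduced by human agents into the textual corpora used to train these models. However, it has also been shown that some non-English embeddings may actually not capture such biases in their word representations. This study, therefore, aimed to answer the question: Does the publicly available Filipino FastText word embedding contain gender bias? Various iterations of the Word Embedding Association Test and principal component analysis were conducted on the embedding to answer this question. Results show that the Tagalog FastText embedding not only represents gendered semantic information properly but also captures biases about masculinity and femininity collectively held by Filipinos. Specifically, the embedding most strongly associates the female with nouns pertaining to domestic and caregiving roles and the male with verbs relating to strength and their bodies. The study's findings can help determine what next steps need to be undertaken to reduce or eliminate bias from Filipino embeddings.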
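
Illustrative note: the Word Embedding Association Test (WEAT) listed among this record's subjects is usually reported as an effect size that compares how strongly two sets of target words associate with two sets of attribute words. The sketch below shows one common formulation of that effect size in plain Python. It is not the authors' code: the function names, the toy 2-D vectors, and the word-set labels are all hypothetical, the study's actual word lists are not given in this record, and the use of the sample standard deviation is an assumption.

```python
import math
from statistics import mean, stdev


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def association(w, attrs_a, attrs_b):
    """Mean similarity of w to attribute set A minus its mean similarity to B."""
    return mean(cosine(w, a) for a in attrs_a) - mean(cosine(w, b) for b in attrs_b)


def weat_effect_size(targets_x, targets_y, attrs_a, attrs_b):
    """Difference of mean associations of X and Y, scaled by the
    sample standard deviation over all target words (an assumption)."""
    s_x = [association(x, attrs_a, attrs_b) for x in targets_x]
    s_y = [association(y, attrs_a, attrs_b) for y in targets_y]
    return (mean(s_x) - mean(s_y)) / stdev(s_x + s_y)


# Toy 2-D vectors, invented for illustration only (not real FastText vectors).
female_terms = [[1.0, 0.1], [0.9, 0.2]]      # attribute set A
male_terms = [[0.1, 1.0], [0.2, 0.9]]        # attribute set B
domestic_words = [[0.8, 0.3], [0.9, 0.1]]    # target set X
strength_words = [[0.2, 0.8], [0.1, 0.9]]    # target set Y

effect = weat_effect_size(domestic_words, strength_words, female_terms, male_terms)
print(f"WEAT effect size: {effect:.3f}")
```

A positive value means the X targets sit closer to attribute set A than the Y targets do. With real data, the toy lists would be replaced by vectors looked up in the pre-trained embedding for each word.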

Bibliographic Details
Main Authors: Gamboa, Lance Calvin, Estuar, Ma. Regina Justina
Format: text
Published: Archīum Ateneo 2023
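Subjects: FastText; Filipino; gender bias; principal component analysis; word embedding; Word Embedding Association Test; Computer Engineering; Electrical and Computer Engineering; Engineering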
Online Access: https://archium.ateneo.edu/discs-faculty-pubs/379
https://doi.org/10.1109/ITIKD56332.2023.10100022
Institution: Ateneo De Manila University
id ph-ateneo-arc.discs-faculty-pubs-1379
record_format eprints
spelling ph-ateneo-arc.discs-faculty-pubs-1379; 2024-02-21T03:19:14Z; 2023-01-01T08:00:00Z; Department of Information Systems & Computer Science Faculty Publications; Archīum Ateneo
institution Ateneo De Manila University
building Ateneo De Manila University Library
continent Asia
country Philippines
content_provider Ateneo De Manila University Library
collection archium.Ateneo Institutional Repository
topic FastText
Filipino
gender bias
principal component analysis
word embedding
Word Embedding Association Test
Computer Engineering
Electrical and Computer Engineering
Engineering
description Past studies show that word embeddings can learn gender biases introduced by human agents into the textual corpora used to train these models. However, it has also been shown that some non-English embeddings may actually not capture such biases in their word representations. This study, therefore, aimed to answer the question: Does the publicly available Filipino FastText word embedding contain gender bias? Various iterations of the Word Embedding Association Test and principal component analysis were conducted on the embedding to answer this question. Results show that the Tagalog FastText embedding not only represents gendered semantic information properly but also captures biases about masculinity and femininity collectively held by Filipinos. Specifically, the embedding most strongly associates the female with nouns pertaining to domestic and caregiving roles and the male with verbs relating to strength and their bodies. The study's findings can help determine what next steps need to be undertaken to reduce or eliminate bias from Filipino embeddings.
format text
author Gamboa, Lance Calvin
Estuar, Ma. Regina Justina
title Evaluating Gender Bias in Pre-trained Filipino FastText Embeddings
publisher Archīum Ateneo
publishDate 2023
url https://archium.ateneo.edu/discs-faculty-pubs/379
https://doi.org/10.1109/ITIKD56332.2023.10100022