Sentiment Analysis of Code-Switched Filipino-English Product and Service Reviews Using Transformers-Based Large Language Models

Bilingual individuals already outnumber monolinguals, yet most of the available resources for research in natural language processing (NLP) are for high-resource single languages. A recent area of interest in NLP research for low-resource languages is code-switching, a phenomenon in both written and spoken communication marked by the usage of at least two languages in one utterance. This work presented two novel contributions to NLP research for low-resource languages. First, it introduced the first sentiment-annotated corpus of Filipino-English Reviews with Code-Switching (FiReCS) with more than 10k instances of product and service reviews. Second, it developed sentiment analysis models for Filipino-English text using pre-trained Transformers-based large language models (LLMs) and introduced benchmark results for zero-shot sentiment analysis on text with code-switching using OpenAI’s GPT-3 series models. The performance of the Transformers-based sentiment analysis models was compared against those of existing lexicon-based sentiment analysis tools designed for monolingual text. The fine-tuned XLM-RoBERTa model achieved the highest accuracy and weighted average F1-score of 0.84 with F1-scores of 0.89, 0.86, and 0.78 in the Positive, Negative, and Neutral sentiment classes, respectively. The poor performance of the lexicon-based sentiment analysis tools exemplifies the limitations of such systems that are designed for a single language when applied to bilingual text involving code-switching.


Bibliographic Details
Main Authors: Cosme, Camilla Johnine, De Leon, Marlene M.
Format: text
Published: Archīum Ateneo 2024
Subjects:
Online Access:https://archium.ateneo.edu/discs-faculty-pubs/411
https://doi.org/10.1007/978-981-99-8349-0_11
Institution: Ateneo De Manila University
id ph-ateneo-arc.discs-faculty-pubs-1411
record_format eprints
spelling ph-ateneo-arc.discs-faculty-pubs-1411 2024-04-15T08:20:18Z Sentiment Analysis of Code-Switched Filipino-English Product and Service Reviews Using Transformers-Based Large Language Models Cosme, Camilla Johnine De Leon, Marlene M. Bilingual individuals already outnumber monolinguals, yet most of the available resources for research in natural language processing (NLP) are for high-resource single languages. A recent area of interest in NLP research for low-resource languages is code-switching, a phenomenon in both written and spoken communication marked by the usage of at least two languages in one utterance. This work presented two novel contributions to NLP research for low-resource languages. First, it introduced the first sentiment-annotated corpus of Filipino-English Reviews with Code-Switching (FiReCS) with more than 10k instances of product and service reviews. Second, it developed sentiment analysis models for Filipino-English text using pre-trained Transformers-based large language models (LLMs) and introduced benchmark results for zero-shot sentiment analysis on text with code-switching using OpenAI’s GPT-3 series models. The performance of the Transformers-based sentiment analysis models was compared against those of existing lexicon-based sentiment analysis tools designed for monolingual text. The fine-tuned XLM-RoBERTa model achieved the highest accuracy and weighted average F1-score of 0.84 with F1-scores of 0.89, 0.86, and 0.78 in the Positive, Negative, and Neutral sentiment classes, respectively. The poor performance of the lexicon-based sentiment analysis tools exemplifies the limitations of such systems that are designed for a single language when applied to bilingual text involving code-switching. 2024-01-01T08:00:00Z text https://archium.ateneo.edu/discs-faculty-pubs/411 https://doi.org/10.1007/978-981-99-8349-0_11 Department of Information Systems & Computer Science Faculty Publications Archīum Ateneo Code-switching Natural language processing Online reviews Sentiment analysis Transformers Computer Sciences Databases and Information Systems Physical Sciences and Mathematics
institution Ateneo De Manila University
building Ateneo De Manila University Library
continent Asia
country Philippines
content_provider Ateneo De Manila University Library
collection archium.Ateneo Institutional Repository
topic Code-switching
Natural language processing
Online reviews
Sentiment analysis
Transformers
Computer Sciences
Databases and Information Systems
Physical Sciences and Mathematics
spellingShingle Code-switching
Natural language processing
Online reviews
Sentiment analysis
Transformers
Computer Sciences
Databases and Information Systems
Physical Sciences and Mathematics
Cosme, Camilla Johnine
De Leon, Marlene M.
Sentiment Analysis of Code-Switched Filipino-English Product and Service Reviews Using Transformers-Based Large Language Models
description Bilingual individuals already outnumber monolinguals, yet most of the available resources for research in natural language processing (NLP) are for high-resource single languages. A recent area of interest in NLP research for low-resource languages is code-switching, a phenomenon in both written and spoken communication marked by the usage of at least two languages in one utterance. This work presented two novel contributions to NLP research for low-resource languages. First, it introduced the first sentiment-annotated corpus of Filipino-English Reviews with Code-Switching (FiReCS) with more than 10k instances of product and service reviews. Second, it developed sentiment analysis models for Filipino-English text using pre-trained Transformers-based large language models (LLMs) and introduced benchmark results for zero-shot sentiment analysis on text with code-switching using OpenAI’s GPT-3 series models. The performance of the Transformers-based sentiment analysis models was compared against those of existing lexicon-based sentiment analysis tools designed for monolingual text. The fine-tuned XLM-RoBERTa model achieved the highest accuracy and weighted average F1-score of 0.84 with F1-scores of 0.89, 0.86, and 0.78 in the Positive, Negative, and Neutral sentiment classes, respectively. The poor performance of the lexicon-based sentiment analysis tools exemplifies the limitations of such systems that are designed for a single language when applied to bilingual text involving code-switching.
format text
author Cosme, Camilla Johnine
De Leon, Marlene M.
author_facet Cosme, Camilla Johnine
De Leon, Marlene M.
author_sort Cosme, Camilla Johnine
title Sentiment Analysis of Code-Switched Filipino-English Product and Service Reviews Using Transformers-Based Large Language Models
title_short Sentiment Analysis of Code-Switched Filipino-English Product and Service Reviews Using Transformers-Based Large Language Models
title_full Sentiment Analysis of Code-Switched Filipino-English Product and Service Reviews Using Transformers-Based Large Language Models
title_fullStr Sentiment Analysis of Code-Switched Filipino-English Product and Service Reviews Using Transformers-Based Large Language Models
title_full_unstemmed Sentiment Analysis of Code-Switched Filipino-English Product and Service Reviews Using Transformers-Based Large Language Models
title_sort sentiment analysis of code-switched filipino-english product and service reviews using transformers-based large language models
publisher Archīum Ateneo
publishDate 2024
url https://archium.ateneo.edu/discs-faculty-pubs/411
https://doi.org/10.1007/978-981-99-8349-0_11
_version_ 1797546527389908992
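
Note: this record only summarizes the approach, so the sketch below is not the authors' implementation. Assuming the Hugging Face transformers and datasets libraries, it illustrates how an XLM-RoBERTa model might be fine-tuned for the three sentiment classes (Negative, Neutral, Positive) named in the abstract. The example reviews, hyperparameters, and output directory are invented placeholders for the FiReCS corpus, which is not distributed with this record.

# Minimal sketch (not the authors' code): fine-tuning XLM-RoBERTa for
# three-class sentiment on code-switched Filipino-English reviews.
from datasets import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# Hypothetical in-memory examples standing in for the FiReCS corpus
# (label ids: 0 = Negative, 1 = Neutral, 2 = Positive).
examples = {
    "text": [
        "Sobrang bagal ng delivery, hindi ko na uulitin.",
        "Okay lang ang product, nothing special.",
        "Ang ganda ng quality, very satisfied ako!",
    ],
    "label": [0, 1, 2],
}

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "xlm-roberta-base", num_labels=3
)

def tokenize(batch):
    # Pad/truncate reviews to a fixed length; 128 tokens is an assumption.
    return tokenizer(
        batch["text"], truncation=True, padding="max_length", max_length=128
    )

train_dataset = Dataset.from_dict(examples).map(tokenize, batched=True)

training_args = TrainingArguments(
    output_dir="xlmr-firecs-sentiment",  # hypothetical output directory
    num_train_epochs=3,                  # illustrative hyperparameters
    per_device_train_batch_size=8,
    learning_rate=2e-5,
)

Trainer(model=model, args=training_args, train_dataset=train_dataset).train()

For the zero-shot benchmark mentioned in the abstract, the paper's prompts and exact GPT-3 series model variants are not given in this record. The assumption-laden snippet below only illustrates how a zero-shot sentiment prompt could be posed through the openai Python client, with a current chat model standing in for the GPT-3 series models evaluated in the paper.

# Illustrative zero-shot prompt; not the paper's actual prompt or model.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

review = "Mabilis ang shipping pero medyo pricey yung item."
response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # stand-in for the GPT-3 series models in the paper
    messages=[{
        "role": "user",
        "content": (
            "Classify the sentiment of this Filipino-English review as "
            f"Positive, Negative, or Neutral. Reply with one word.\n\n{review}"
        ),
    }],
)
print(response.choices[0].message.content)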