Towards a bilingual sentiment analysis model for English and Filipino

There is an opportunity to learn and understand how Filipinos think, behave and react online, especially in responding to significant events. Resources, such as lexicons and corpora or a combination in a target language, as well as a selection of machine learning classifiers, may be used to address this o...

Bibliographic Details
Main Author:	MARLENE, DE LEON
Format:	text
Published:	Archīum Ateneo 2013
Subjects:	Computational linguistics > Case studies; Corpora (Linguistics); Code switching (Linguistics); Artificial intelligence; Computer Engineering
Online Access:	https://archium.ateneo.edu/theses-dissertations/226 http://rizalls.lib.admu.edu.ph/#section=resource&resourceid=234946739&currentIndex=0&view=fullDetailsDetailsTab
Institution:	Ateneo De Manila University

id	ph-ateneo-arc.theses-dissertations-1352
record_format	eprints
spelling	ph-ateneo-arc.theses-dissertations-1352 2021-07-06T02:19:47Z Towards a bilingual sentiment analysis model for English and Filipino MARLENE, DE LEON There is an opportunity to learn and understand how Filipinos think, behave and react online, especially in responding to significant events. Resources, such as lexicons and corpora or a combination in a target language, as well as a selection of machine learning classifiers, may be used to address this opportunity. However, there is little work on bilingual conversations. Filipino tweets provide a rich source of data for building corpora and models for this kind of classification, as they are composed of a mixture of English and mostly Filipino terms. This study looked into building bilingual sentiment analysis models for classifying bilingual English and Filipino disaster tweets. The study applied a supervised learning approach for subjective and sentiment models using Support Vector Machine (SVM), Naïve Bayes, and K-Nearest Neighbor (K-NN) classifiers and a bilingual English and Filipino lexicon, corpora, and a combination of the two in fixed distribution sets, to create bilingual English and Filipino sentiment analysis models. Accuracy, precision, recall and F-measure were used to evaluate the performance of the models. Each of the resulting models was further evaluated against manually annotated corpora of tweets to determine its performance and reliability. For the bilingual subjective classification model, performance was highest with Naïve Bayes, using the combination of lexicon and corpora, at a 95% objective-5% subjective imbalanced distribution, with an F-measure of 73.53%. Similarly, the bilingual sentiment classification model performed best with Naïve Bayes, using the combination of lexicon and corpora, at 95% positive-5% negative, with an F-measure of 72.41%. The study showed that for English-Filipino sentiments, bilingual classification works best with an imbalanced distribution scheme and a combination of lexicon and corpora data sets. PCA was further performed on the resulting positive and negative sentiments to obtain manifest constructs on sentiments. Results showed a promising possibility of extending the bilingual sentiment classification model further to include specific positive and negative emotions. 2013-01-01T08:00:00Z text https://archium.ateneo.edu/theses-dissertations/226 http://rizalls.lib.admu.edu.ph/#section=resource&resourceid=234946739&currentIndex=0&view=fullDetailsDetailsTab Theses and Dissertations (All) Archīum Ateneo Computational linguistics -- Case studies Corpora (Linguistics) Code switching (Linguistics) Artificial intelligence Computer Engineering
institution	Ateneo De Manila University
building	Ateneo De Manila University Library
continent	Asia
country	Philippines
content_provider	Ateneo De Manila University Library
collection	archium.Ateneo Institutional Repository
topic	Computational linguistics -- Case studies Corpora (Linguistics) Code switching (Linguistics) Artificial intelligence Computer Engineering
spellingShingle	Computational linguistics -- Case studies Corpora (Linguistics) Code switching (Linguistics) Artificial intelligence Computer Engineering MARLENE, DE LEON Towards a bilingual sentiment analysis model for English and Filipino
description	There is an opportunity to learn and understand how Filipinos think, behave and react online, especially in responding to significant events. Resources, such as lexicons and corpora or a combination in a target language, as well as a selection of machine learning classifiers, may be used to address this opportunity. However, there is little work on bilingual conversations. Filipino tweets provide a rich source of data for building corpora and models for this kind of classification, as they are composed of a mixture of English and mostly Filipino terms. This study looked into building bilingual sentiment analysis models for classifying bilingual English and Filipino disaster tweets. The study applied a supervised learning approach for subjective and sentiment models using Support Vector Machine (SVM), Naïve Bayes, and K-Nearest Neighbor (K-NN) classifiers and a bilingual English and Filipino lexicon, corpora, and a combination of the two in fixed distribution sets, to create bilingual English and Filipino sentiment analysis models. Accuracy, precision, recall and F-measure were used to evaluate the performance of the models. Each of the resulting models was further evaluated against manually annotated corpora of tweets to determine its performance and reliability. For the bilingual subjective classification model, performance was highest with Naïve Bayes, using the combination of lexicon and corpora, at a 95% objective-5% subjective imbalanced distribution, with an F-measure of 73.53%. Similarly, the bilingual sentiment classification model performed best with Naïve Bayes, using the combination of lexicon and corpora, at 95% positive-5% negative, with an F-measure of 72.41%. The study showed that for English-Filipino sentiments, bilingual classification works best with an imbalanced distribution scheme and a combination of lexicon and corpora data sets. PCA was further performed on the resulting positive and negative sentiments to obtain manifest constructs on sentiments. Results showed a promising possibility of extending the bilingual sentiment classification model further to include specific positive and negative emotions.
format	text
author	MARLENE, DE LEON
author_facet	MARLENE, DE LEON
author_sort	MARLENE, DE LEON
title	Towards a bilingual sentiment analysis model for English and Filipino
title_short	Towards a bilingual sentiment analysis model for English and Filipino
title_full	Towards a bilingual sentiment analysis model for English and Filipino
title_fullStr	Towards a bilingual sentiment analysis model for English and Filipino
title_full_unstemmed	Towards a bilingual sentiment analysis model for English and Filipino
title_sort	towards a bilingual sentiment analysis model for english and filipino
publisher	Archīum Ateneo
publishDate	2013
url	https://archium.ateneo.edu/theses-dissertations/226 http://rizalls.lib.admu.edu.ph/#section=resource&resourceid=234946739&currentIndex=0&view=fullDetailsDetailsTab
_version_	1712577819686469632

Similar Items