Exploring neural network approaches in automatic personality recognition of Filipino Twitter users

The field of Automatic Personality Recognition (APR) is steadily growing in its goal to determine and understand personality traits. Many studies that work on text data have looked at different data sources, feature extraction techniques, and machine learning methods. More recently, studies have...

Bibliographic Details
Main Authors: Tighe, Edward P., Aran, Oyan, Cheng, Charibeth K.
Format: text
Published: Animo Repository 2020
Subjects:
Online Access:https://animorepository.dlsu.edu.ph/faculty_research/13395
Institution: De La Salle University
id oai:animorepository.dlsu.edu.ph:faculty_research-15151
record_format eprints
spelling oai:animorepository.dlsu.edu.ph:faculty_research-15151 2024-11-11T08:21:02Z Exploring neural network approaches in automatic personality recognition of Filipino Twitter users Tighe, Edward P. Aran, Oyan Cheng, Charibeth K. The field of Automatic Personality Recognition (APR) is steadily growing in its goal to determine and understand personality traits. Many studies that work on text data have looked at different data sources, feature extraction techniques, and machine learning methods. More recently, studies have gravitated towards neural network (NN) based approaches, comparing them against traditional learning techniques. This presents an opportunity to explore the usefulness of NNs in APR when dealing with Filipino Twitter users. Filipino Twitter users typically write in both English and Tagalog. Because this mixes a high-resource and a low-resource language, approaches centered on high-resource languages might not fully capture personality information. In our work, we performed APR on a dataset of 250 Filipino Twitter users, focusing only on Openness and Conscientiousness. We explore (1) different multilayer perceptron (MLP) configurations fed with term frequency-inverse document frequency (TF-IDF) values, and (2) the use of trained and pre-trained word embeddings (English and Tagalog) as features fed into the best-identified MLP configurations. Findings show that none of the models for Openness performed well, with the best model having an R² value of 0.0211. As for Conscientiousness, a TF-IDF-fed MLP with five hidden layers (128 units each) performed best, with an RMSE of 0.3344 and an R² of 0.2799 (an increase of 0.11 over the baseline). MLPs fed with word embeddings, whether trained or pre-trained, did not perform as well; simpler MLPs using binary or TF-IDF features performed better. 2020-03-01T08:00:00Z text https://animorepository.dlsu.edu.ph/faculty_research/13395 Faculty Research Work Animo Repository Neural networks (Computer science) Natural language processing (Computer science) Social media—Data processing Computer Sciences
institution De La Salle University
building De La Salle University Library
continent Asia
country Philippines
content_provider De La Salle University Library
collection DLSU Institutional Repository
topic Neural networks (Computer science)
Natural language processing (Computer science)
Social media—Data processing
Computer Sciences
spellingShingle Neural networks (Computer science)
Natural language processing (Computer science)
Social media—Data processing
Computer Sciences
Tighe, Edward P.
Aran, Oyan
Cheng, Charibeth K.
Exploring neural network approaches in automatic personality recognition of Filipino twitter users
description The field of Automatic Personality Recognition (APR) is steadily growing in its goal to determine and understand personality traits. Many studies that work on text data have looked at different data sources, feature extraction techniques, and machine learning methods. More recently, studies have gravitated towards neural network (NN) based approaches, comparing them against traditional learning techniques. This presents an opportunity to explore the usefulness of NNs in APR when dealing with Filipino Twitter users. Filipino Twitter users typically write in both English and Tagalog. Because this mixes a high-resource and a low-resource language, approaches centered on high-resource languages might not fully capture personality information. In our work, we performed APR on a dataset of 250 Filipino Twitter users, focusing only on Openness and Conscientiousness. We explore (1) different multilayer perceptron (MLP) configurations fed with term frequency-inverse document frequency (TF-IDF) values, and (2) the use of trained and pre-trained word embeddings (English and Tagalog) as features fed into the best-identified MLP configurations. Findings show that none of the models for Openness performed well, with the best model having an R² value of 0.0211. As for Conscientiousness, a TF-IDF-fed MLP with five hidden layers (128 units each) performed best, with an RMSE of 0.3344 and an R² of 0.2799 (an increase of 0.11 over the baseline). MLPs fed with word embeddings, whether trained or pre-trained, did not perform as well; simpler MLPs using binary or TF-IDF features performed better.
format text
author Tighe, Edward P.
Aran, Oyan
Cheng, Charibeth K.
author_facet Tighe, Edward P.
Aran, Oyan
Cheng, Charibeth K.
author_sort Tighe, Edward P.
title Exploring neural network approaches in automatic personality recognition of Filipino twitter users
title_short Exploring neural network approaches in automatic personality recognition of Filipino twitter users
title_full Exploring neural network approaches in automatic personality recognition of Filipino twitter users
title_fullStr Exploring neural network approaches in automatic personality recognition of Filipino twitter users
title_full_unstemmed Exploring neural network approaches in automatic personality recognition of Filipino twitter users
title_sort exploring neural network approaches in automatic personality recognition of filipino twitter users
publisher Animo Repository
publishDate 2020
url https://animorepository.dlsu.edu.ph/faculty_research/13395
_version_ 1816861348383948800
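
Note: the abstract above describes feeding TF-IDF features of users' tweets into multilayer perceptrons and scoring them with RMSE and R². Below is a minimal sketch of that general setup using scikit-learn. It is not the authors' code; the toy documents, scores, vectorizer settings, and training hyperparameters (everything except the five 128-unit hidden layers) are assumptions for illustration only.

    # Sketch of a TF-IDF-fed MLP regressor, assuming scikit-learn (not the authors' code).
    import numpy as np
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.neural_network import MLPRegressor
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import mean_squared_error, r2_score

    # Hypothetical per-user documents (a user's tweets concatenated) and
    # Conscientiousness scores; the real 250-user dataset is not reproduced here.
    docs = [
        "tapos na ang report, time to plan next week",
        "grabe ang traffic today pero on time pa rin ako",
        "procrastinating na naman, bahala na",
        "planning my week every Sunday keeps me sane",
    ] * 25  # repeated only to have enough rows for a split; purely illustrative
    scores = np.tile([0.71, 0.64, 0.35, 0.80], 25)

    # TF-IDF features over the mixed English/Tagalog text.
    X = TfidfVectorizer(max_features=5000).fit_transform(docs)

    X_train, X_test, y_train, y_test = train_test_split(
        X, scores, test_size=0.2, random_state=42)

    # Five hidden layers of 128 units each, mirroring the best Conscientiousness model.
    mlp = MLPRegressor(hidden_layer_sizes=(128, 128, 128, 128, 128),
                       max_iter=500, random_state=42)
    mlp.fit(X_train, y_train)

    # Evaluate with the same metrics reported in the abstract: RMSE and R².
    pred = mlp.predict(X_test)
    rmse = np.sqrt(mean_squared_error(y_test, pred))
    print(f"RMSE={rmse:.4f}  R2={r2_score(y_test, pred):.4f}")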