Modeling personality traits of Filipino twitter users based on linguistic markers

There have been multiple studies that correlate a person's writing style and personality traits. With the power of machine learning, this eventually led to the rise of computational text-based personality trait recognition. The field is constantly growing as it started from analyzing personal essays an...

Bibliographic Details
Main Author: Tighe, Edward P.
Format: text
Language: English
Published: Animo Repository 2017
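Subjects: Natural language processing (Computer science); Information filtering systems; Machine learning; Personality; Online social networks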
Online Access: https://animorepository.dlsu.edu.ph/etd_masteral/5330
Institution: De La Salle University
id oai:animorepository.dlsu.edu.ph:etd_masteral-12168
record_format eprints
spelling oai:animorepository.dlsu.edu.ph:etd_masteral-12168 2024-09-13T01:54:45Z 2017-01-01T08:00:00Z text https://animorepository.dlsu.edu.ph/etd_masteral/5330 Master's Theses English Animo Repository
institution De La Salle University
building De La Salle University Library
continent Asia
country Philippines
content_provider De La Salle University Library
collection DLSU Institutional Repository
language English
topic Natural language processing (Computer science)
Information filtering systems
Machine learning
Personality
Online social networks
description There have been multiple studies that correlate a person's writing style and personality traits. With the power of machine learning, this eventually led to the rise of computational text-based personality trait recognition. The field is constantly growing: it started from analyzing personal essays and is currently exploring the enormous amount of data available from social networking sites such as Facebook or Twitter. Current studies have shifted from analyzing English to analyzing non-English languages; however, the field is still lacking in three areas: (1) analysis of the Filipino language, (2) analysis of Filipinos', or a group of individuals', word choice, and (3) analysis of the output of feature reduction techniques. This research has addressed each of these concerns by collecting and processing the Tweets of 288 Filipino Twitter users. A language-independent approach was implemented to handle the multiple languages that could be spoken by individuals. Computational models were then created for each of the personality traits of the Five Factor Model. Findings show that Conscientiousness is the easiest trait to model (F1 = 0.8251; = 0.6499), while the model for Openness is the hardest (F1 = 0.6194; = 0.2414). Analysis also showed that 1-grams are sufficient to model traits for all of the Big Five, except for Extraversion, which utilized 1-, 2-, and 3-grams. This research also analyzed the feature-reduced datasets used by each trait's top-performing models to identify the composition of the set of features. Findings show that there are 11 LIWC2015 categories that are common amongst all of the Big Five, such as Active Processes, Positive Emotion, and Informal Language.
format text
author Tighe, Edward P.
title Modeling personality traits of Filipino twitter users based on linguistic markers
publisher Animo Repository
publishDate 2017
url https://animorepository.dlsu.edu.ph/etd_masteral/5330
_version_ 1811611555871064064