An investigation on correlation between linguistic features and personality traits : a case of Singapore Twitter users.

Bibliographic Details
Main Author: Yew, Chee Soon.
Other Authors: Theng Yin Leng
Format: Theses and Dissertations
Language: English
Published: 2013
Online Access: http://hdl.handle.net/10356/54758
Institution: Nanyang Technological University
Description
Summary: In this exploratory study, twelve linguistic features that previous studies found to correlate with personality traits were selected to detect whether Singaporeans and Westerners differ in these correlations. Corpora were generated from six million publicly available tweets posted by 2,547 Singapore Twitter users. The corpora were broken into segments of 1,000 words, and each subject's usage of the linguistic features was calculated. In parallel, six undergraduate observers assessed the subjects' personality traits under the Big Five Personality Model using the BFI-10 inventory. The observers were trained and briefed before beginning the assessment, which lasted slightly more than two weeks. At the end of the assessment, three main issues were raised that might affect the reliability of the results. First, differences in how the observers understood the judging criteria affected the inter-rater agreement, as measured using Krippendorff's Alpha. Second, fatigue affected the observers' performance. Third, the lack of a reviewing mechanism also decreased the reliability of the results. Nevertheless, third-party observation is still recommended for studies with large sample sizes. When Pearson's correlation coefficients were used to determine the correlations between personality traits and the usage of linguistic features, a considerable number of differences emerged compared with research done on Westerners. These differences make it reasonable to hypothesize that cultural differences play a part in the correlation between personality traits and the usage of linguistic features. Our results supported the hypothesis from previous studies that the usage of first-person pronouns correlates significantly with Collectivism. Specifically, first-person pronouns in the singular and plural forms showed inverse correlations with each other, highlighting the possible influence of cultural factors on the results. Furthermore, the usage of vulgarities was inversely correlated with first-person plural pronouns, which also supports the hypothesis, since the usage of vulgarities is undesirable in a collectivistic society.
On the whole, this dissertation seeks to advance current research in two main areas. First, by studying the inter-rater agreement of the observers, we identified several potential issues that may hamper the reliability of third-party observation as a technique for assessing personality traits. This allows researchers to prepare pre-emptive measures to increase the reliability of the results they gather. Second, our results draw attention to the cultural influence on the usage of linguistic features, which may eventually increase the accuracy of artificial intelligence systems built to identify the personality traits of subjects on a Social Networking Site, or of other systems for which corpora can be collected.
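The segment-and-correlate procedure summarized above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the pronoun lexicon, the function names, and the sample numbers in the usage note are all hypothetical, and the record does not say how usage was aggregated across segments (the sketch assumes a mean count per 1,000-word segment).

```python
import math
import re

# Hypothetical lexicon for one feature; the thesis's actual word lists are not given here.
FIRST_PERSON_SINGULAR = {"i", "me", "my", "mine", "myself"}


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def segment(tokens, size=1000):
    """Split tokens into consecutive full segments of `size` words, dropping any remainder."""
    return [tokens[i:i + size] for i in range(0, len(tokens) - size + 1, size)]


def feature_rate(tokens, lexicon, size=1000):
    """Mean number of lexicon hits per full segment; None if there is no full segment."""
    segments = segment(tokens, size)
    if not segments:
        return None
    counts = [sum(1 for tok in seg if tok in lexicon) for seg in segments]
    return sum(counts) / len(counts)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)
```

As a usage illustration with made-up numbers, per-subject rates such as `[12.0, 30.5, 18.2, 25.1]` could be paired with observer-assigned trait scores such as `[2.5, 4.0, 3.0, 3.5]` and passed to `pearson_r(rates, scores)` to obtain one feature-trait correlation.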