A model for age and gender profiling of social media accounts based on post contents

The growth of social networking platforms such as Facebook and Twitter has opened communication channels through which people share their thoughts and sentiments. However, the rapid growth of the Internet has also brought anonymity, under which user identities are e...

Bibliographic Details
Main Authors: Cheng, Jan Kristoffer; Fernandez, Avril; Quindoza, Rissa Grace Marie; Tan, Shayane; Cheng, Charibeth
Format: text
Published: Animo Repository 2018
Subjects: Consumer profiling; Computational linguistics; Machine learning
Online Access: https://animorepository.dlsu.edu.ph/faculty_research/3514
https://animorepository.dlsu.edu.ph/context/faculty_research/article/4516/type/native/viewcontent/978_3_030_04179_3_10.html
Institution: De La Salle University
id oai:animorepository.dlsu.edu.ph:faculty_research-4516
record_format eprints
spelling oai:animorepository.dlsu.edu.ph:faculty_research-4516
2021-09-11T08:30:38Z
© 2018, Springer Nature Switzerland AG.
2018-01-01T08:00:00Z
text
text/html
info:doi/10.1007/978-3-030-04179-3_10
Faculty Research Work
Animo Repository
institution De La Salle University
building De La Salle University Library
continent Asia
country Philippines
content_provider De La Salle University Library
collection DLSU Institutional Repository
topic Consumer profiling
Computational linguistics
Machine learning
description The growth of social networking platforms such as Facebook and Twitter has opened communication channels through which people share their thoughts and sentiments. However, the rapid growth of the Internet has also brought anonymity, under which user identities are easily falsified or hidden. This makes it difficult for businesses to target advertisements accurately at specific account demographics. This study therefore searched for the best model to identify the gender and age group of Filipino social media accounts by analyzing post contents. Two model structures for the classifier were tested: the stacked (combined) structure and the parallel structure. Several types of features were considered, including those based on socio-linguistics, grammar, characters and words. The results show that age classification and gender classification call for different model structures, features, feature reduction methods and classification algorithms. For age classification, the best model on both platforms was a Support Vector Classifier (SVC) with least absolute shrinkage and selection operator (Lasso), on a parallel model structure for Facebook and a combined model structure for Twitter. For gender classification, the best model for Facebook used a Ridge Classifier (RC), while the best model for Twitter used SVC; both used Lasso on a parallel model structure. The dominant features in age classification on both platforms were word-based features, socio-linguistic features and post time, while socio-linguistic features, specifically netspeak, were important in gender classification on both platforms. Because different features drive model performance on each platform, Facebook and Twitter data must be analyzed separately, as the posts on these two platforms differ significantly. © 2018, Springer Nature Switzerland AG.
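The feature types named in the description (socio-linguistic, character-based, word-based, post time) can be illustrated with a minimal Python sketch. This is not the authors' feature extractor: the netspeak lexicon, the function name `extract_features`, and the specific ratios are hypothetical choices made for illustration, and grammar-based features are left out because they would need a part-of-speech tagger.

```python
import re
from collections import Counter
from datetime import datetime

# Hypothetical netspeak lexicon, for illustration only; the lexicon used
# in the study is not given in this record.
NETSPEAK = {"lol", "omg", "brb", "haha", "hehe", "u", "idk"}


def extract_features(text: str, posted_at: datetime) -> dict:
    """Turn one post into a small dictionary of numeric features."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    n_tokens = len(tokens) or 1  # avoid division by zero on empty posts
    n_chars = len(text) or 1
    features = {
        # Socio-linguistic: share of tokens that are netspeak terms.
        "netspeak_ratio": sum(t in NETSPEAK for t in tokens) / n_tokens,
        # Character-based: share of uppercase letters and exclamation marks.
        "upper_ratio": sum(c.isupper() for c in text) / n_chars,
        "exclaim_ratio": text.count("!") / n_chars,
        # Post time: hour of day (0-23).
        "post_hour": posted_at.hour,
    }
    # Word-based: raw count of each token, prefixed to avoid key clashes.
    for word, count in Counter(tokens).items():
        features[f"word={word}"] = count
    return features


post = extract_features("OMG lol see u later!!", datetime(2018, 1, 1, 23, 15))
```

A dictionary like `post` could then be vectorized and passed to a feature reduction step and a classifier such as those named in the description; how the study actually wired these together in its parallel and combined structures is not stated in this record.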
format text
author Cheng, Jan Kristoffer
Fernandez, Avril
Quindoza, Rissa Grace Marie
Tan, Shayane
Cheng, Charibeth
author_sort Cheng, Jan Kristoffer
title A model for age and gender profiling of social media accounts based on post contents
title_sort model for age and gender profiling of social media accounts based on post contents
publisher Animo Repository
publishDate 2018
url https://animorepository.dlsu.edu.ph/faculty_research/3514
https://animorepository.dlsu.edu.ph/context/faculty_research/article/4516/type/native/viewcontent/978_3_030_04179_3_10.html
_version_ 1767195920909205504