Automatic occupation identification of Twitter users using graph neural network

With the advent of the digital age, the Internet has gradually become an integral part of people's lives. The emergence and development of the Internet have significantly improved people's lives, and information can be spread to all parts of the world in a very short period of time through...

Bibliographic Details
Main Author:	Li, Jiaheng
Other Authors:	Na Jin Cheon
Format:	Thesis-Master by Coursework
Language:	English
Published:	Nanyang Technological University 2023
Subjects:	Social sciences::Communication
Online Access:	https://hdl.handle.net/10356/166247

id	sg-ntu-dr.10356-166247
record_format	dspace
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	Social sciences::Communication
spellingShingle	Social sciences::Communication Li, Jiaheng Automatic occupation identification of Twitter users using graph neural network
description	With the advent of the digital age, the Internet has become an integral part of people's lives. It lets information spread to all parts of the world in a very short time, so people can communicate across distances and follow what is happening worldwide from home. As one of the most popular online social platforms in the world, Twitter makes it easy for people to share information: users can tweet about what is happening around them, find out what is happening elsewhere, and find the articles and information they want. Twitter thus also facilitates the democratization of scholarly articles. More and more users now post academic articles on Twitter because they want to share their research results, including cutting-edge findings, and to discuss them with practitioners in related fields. Both researchers and the public have come to see Twitter as a search platform for scholarly articles, and a growing number of researchers have found that they can use Twitter to investigate which academic fields interest the public most. In this work, I build a classifier to support such analytical studies. My purpose is to automatically classify the user types of Twitter accounts that post academic articles, so that future researchers can use the data provided by this project to analyze the audience of their research articles. This will help to better understand the public's needs for scientific information and can shed light on future research directions. I used an existing dataset collected from Almetrics.com. The data include account information for Twitter users who posted academic papers, including account names, personal descriptions, tweets, collections, forwarding, etc. The dataset covers eight user occupations: Academic publishers; Academic researchers & institutions; Health science professionals & institutions; Mass Media; Non-academic researchers & institutions; Research feeds; Topic feeds & news alerts; and Others. The collected data also include a graph for each Twitter account, in which each vertex represents a Twitter account and each edge represents a relationship between accounts. A variety of graph neural network algorithms were then applied to build classifiers that automatically classify the accounts that post academic articles. I used existing graph neural network algorithms, namely GATv1, GATv2, GraphSAGE, GIN, TransformerConv, and BERT&GAT, and created novel algorithms, namely BERT&GraphSAGE, BERT&TransformerConv, and TransformerConv&Linear. I built nine classifiers using these algorithms, then compared and tuned them to find the one with the best performance. After several experimental comparisons, the results show that the novel BERT&TransformerConv algorithm performs best among the nine, and the classifier built with it reached a test accuracy of 86%. The study also has limitations: the amount of experimental data is not particularly large, and the IP addresses of all Twitter accounts are local to Singapore. Therefore, the accuracy of this classifier in other regions remains to be tested.
author2	Na Jin Cheon
author_facet	Na Jin Cheon Li, Jiaheng
format	Thesis-Master by Coursework
author	Li, Jiaheng
author_sort	Li, Jiaheng
title	Automatic occupation identification of Twitter users using graph neural network
title_sort	automatic occupation identification of twitter users using graph neural network
publisher	Nanyang Technological University
publishDate	2023
url	https://hdl.handle.net/10356/166247
_version_	1764208022387687424
spelling	sg-ntu-dr.10356-166247 2023-04-23T15:40:28Z Li, Jiaheng Na Jin Cheon Wee Kim Wee School of Communication and Information TJCNa@ntu.edu.sg Social sciences::Communication Master of Science (Information Systems) 2023-04-19T00:20:20Z 2023-04-19T00:20:20Z 2023 Thesis-Master by Coursework Li, J. (2023). Automatic occupation identification of Twitter users using graph neural network. Master's thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/166247 en application/pdf Nanyang Technological University
