An unsupervised multilingual approach for online social media topic identification

Social media data can be valuable in many ways. However, the vast amount of content shared and the linguistic variants of languages used on social media are making it very challenging for high-value topics to be identified. In this paper, we present an unsupervised multilingual approach for identify...

Bibliographic Details
Main Authors: LO, Siaw Ling, CHIONG, Raymond, CORNFORTH, David
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2017
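Subjects: topic identification; multilingual analysis; unsupervised learning; social media; Computer Engineering; Social Media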
Online Access: https://ink.library.smu.edu.sg/sis_research/4873
https://ink.library.smu.edu.sg/context/sis_research/article/5876/viewcontent/An_unsupervised___PV.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-5876
record_format dspace
Record: sg-smu-ink.sis_research-5876
Record timestamp: 2020-02-13T08:50:11Z
Abstract: Social media data can be valuable in many ways. However, the vast amount of content shared and the linguistic variants of languages used on social media are making it very challenging for high-value topics to be identified. In this paper, we present an unsupervised multilingual approach for identifying highly relevant terms and topics from the mass of social media data. This approach combines term ranking, localised language analysis, unsupervised topic clustering and multilingual sentiment analysis to extract prominent topics through analysis of Twitter’s tweets from a period of time. It is observed that each of the ranking methods tested has its own strengths and weaknesses, and that our proposed ‘Joint’ ranking method is able to take advantage of the strengths of the ranking methods. This ‘Joint’ ranking method coupled with an unsupervised topic clustering model is shown to have the potential to discover topics of interest or concern to a local community. Practically, being able to do so may help decision makers to gauge the true opinions or concerns on the ground. Theoretically, the research is significant as it shows how an unsupervised online topic identification approach can be designed without much manual annotation effort, which may have great implications for future development of expert and intelligent systems.
Date: 2017-09-01T07:00:00Z
Media type: text, application/pdf
DOI: 10.1016/j.eswa.2017.03.029
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Collection: Research Collection School Of Computing and Information Systems
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
topic topic identification
multilingual analysis
unsupervised learning
social media
Computer Engineering
Social Media