Online tweet summarization : a topic modelling-based approach

Twitter is an online social networking service, in which users post short messages called “tweets”. Twitter users can follow other users, forming a network whereby a user receives all the tweets posted by the users that he/she follows. Similar to many other social networking services such as Faceboo...

Bibliographic Details
Main Author: Chin, Jin Yao
Other Authors: Sourav Saha Bhowmick
Format: Final Year Project
Language: English
Published: 2016
Subjects: DRNTU::Engineering::Computer science and engineering
Online Access: http://hdl.handle.net/10356/67285
Institution: Nanyang Technological University
id sg-ntu-dr.10356-67285
record_format dspace
spelling sg-ntu-dr.10356-67285 2023-03-03T20:38:53Z Online tweet summarization : a topic modelling-based approach Chin, Jin Yao Sourav Saha Bhowmick School of Computer Engineering DRNTU::Engineering::Computer science and engineering
Bachelor of Engineering (Computer Science) 2016-05-13T06:44:55Z 2016-05-13T06:44:55Z 2016 Final Year Project (FYP) http://hdl.handle.net/10356/67285 en Nanyang Technological University 114 p. application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering::Computer science and engineering
description Twitter is an online social networking service in which users post short messages called “tweets”. Twitter users can follow other users, forming a network in which a user receives all the tweets posted by the users he or she follows. Like many other social networking services, such as Facebook and Instagram, Twitter has used a reverse chronological timeline since its release. The reverse chronological timeline is inadequate for two main reasons: (a) the most recent posts may repeat the same information, and (b) it can be relatively difficult for users to see the overall picture of the topics being discussed across the entire collection of most recent posts. To overcome these limitations, this project presents an alternative approach based on topic modelling. Topic modelling is a text mining technique used to identify hidden topics in a collection of text documents; we adopted the most basic and widely used topic model, Latent Dirichlet Allocation (LDA), for the project. Beyond identifying the most salient topics in a collection of tweets, we also examined and proposed solutions to issues such as ranking tweets by their relevance to a topic and generating topic labels and topic summaries. A pilot user study involving 20 participants was conducted to evaluate the performance of the proposed solution. The findings from the user study show that the topic modelling-based approach outperforms the reverse chronological baseline in many areas and highlight the feasibility of the approach. The user study also helped to identify areas of future work that could further enhance the proposed solution.
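The pipeline the description outlines (fit an LDA topic model, rank tweets by relevance to a topic, derive a topic label) can be illustrated with a minimal sketch. This is not the project's implementation, which this record does not describe in detail. It assumes a small collapsed Gibbs sampler for LDA, whitespace tokenization, relevance defined as a tweet's estimated topic proportion, and labels taken from a topic's most probable words. The function names (`fit_lda`, `rank_tweets`, `top_words`), the hyperparameters, and the toy tweets are all hypothetical.

```python
import random


def fit_lda(docs, num_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA (illustrative, not optimized).

    docs is a list of token lists. Returns (theta, phi, vocab), where
    theta[d][k] estimates P(topic k | doc d) and phi[k][w] estimates
    P(word w | topic k).
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    corpus = [[word_id[w] for w in doc] for doc in docs]
    V = len(vocab)

    doc_topic = [[0] * num_topics for _ in corpus]      # tokens in doc d assigned to topic k
    topic_word = [[0] * V for _ in range(num_topics)]   # times word w is assigned to topic k
    topic_total = [0] * num_topics                      # tokens assigned to topic k
    assign = []

    # Random initial topic assignment for every token.
    for d, doc in enumerate(corpus):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(corpus):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample its topic from the collapsed conditional.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta) / (topic_total[t] + V * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    theta = [
        [(doc_topic[d][t] + alpha) / (len(doc) + num_topics * alpha)
         for t in range(num_topics)]
        for d, doc in enumerate(corpus)
    ]
    phi = [
        [(topic_word[t][w] + beta) / (topic_total[t] + V * beta) for w in range(V)]
        for t in range(num_topics)
    ]
    return theta, phi, vocab


def rank_tweets(theta, topic):
    """Tweet indices sorted by estimated proportion of `topic`, highest first."""
    return sorted(range(len(theta)), key=lambda d: theta[d][topic], reverse=True)


def top_words(phi, vocab, topic, n=3):
    """The n most probable words of `topic`, usable as a crude topic label."""
    order = sorted(range(len(vocab)), key=lambda w: phi[topic][w], reverse=True)
    return [vocab[w] for w in order[:n]]


# Hypothetical toy tweets, not data from the project.
tweets = [
    "train delay on the east line this morning",
    "another train delay east line is down again",
    "new phone launch event today",
    "phone launch keynote starts soon",
]
theta, phi, vocab = fit_lda([t.lower().split() for t in tweets], num_topics=2)

for k in range(2):
    best = rank_tweets(theta, k)[0]
    print("topic", k, "label:", " ".join(top_words(phi, vocab, k)), "| top tweet:", tweets[best])
```

In this sketch each topic is shown once, with a label and its highest-ranked tweet, instead of listing every tweet newest first. A real system would more likely use an established LDA library, handle Twitter-specific tokens such as hashtags and mentions, and choose the number of topics from the data.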
author2 Sourav Saha Bhowmick
author_facet Sourav Saha Bhowmick
Chin, Jin Yao
format Final Year Project
author Chin, Jin Yao
author_sort Chin, Jin Yao
title Online tweet summarization : a topic modelling-based approach
title_sort online tweet summarization : a topic modelling-based approach
publishDate 2016
url http://hdl.handle.net/10356/67285
_version_ 1759856500355891200