Online tweet summarization : a topic modelling-based approach
Twitter is an online social networking service, in which users post short messages called “tweets”. Twitter users can follow other users, forming a network whereby a user receives all the tweets posted by the users that he/she follows. Similar to many other social networking services such as Faceboo...
Main Author: | Chin, Jin Yao |
---|---|
Other Authors: | Sourav Saha Bhowmick |
Format: | Final Year Project |
Language: | English |
Published: | 2016 |
Subjects: | DRNTU::Engineering::Computer science and engineering |
Online Access: | http://hdl.handle.net/10356/67285 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-67285 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-67285 2023-03-03T20:38:53Z Online tweet summarization : a topic modelling-based approach Chin, Jin Yao Sourav Saha Bhowmick School of Computer Engineering DRNTU::Engineering::Computer science and engineering Twitter is an online social networking service in which users post short messages called “tweets”. Twitter users can follow other users, forming a network whereby a user receives all the tweets posted by the users that he/she follows. Similar to many other social networking services such as Facebook and Instagram, Twitter has adopted a reverse chronological timeline since its release. The reverse chronological timeline is inadequate for two main reasons: (a) the most recent posts may repeat the same information, and (b) it can be relatively difficult for users to see the overall picture of the topics being discussed in the entire collection of most recent posts. To overcome the limitations of the reverse chronological timeline, we present an alternative approach based on topic modelling in this project. Topic modelling is a text mining technique used to identify hidden topics in a collection of text documents, and we adopted the most basic and widely used topic model, based on Latent Dirichlet Allocation (LDA), for the project. Beyond identifying the most salient topics in a collection of tweets, we also examined and proposed solutions to issues such as ranking tweets by their relevance to a topic, as well as the generation of topic labels and topic summaries. A pilot user study involving 20 participants was conducted to evaluate the performance of the proposed solution. The findings from the user study show that the topic modelling-based approach outperforms the reverse chronological baseline in many areas, and highlight the feasibility of this approach. The user study also helped identify areas of future work that could further enhance the proposed solution. 
Bachelor of Engineering (Computer Science) 2016-05-13T06:44:55Z 2016-05-13T06:44:55Z 2016 Final Year Project (FYP) http://hdl.handle.net/10356/67285 en Nanyang Technological University 114 p. application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
DRNTU::Engineering::Computer science and engineering |
spellingShingle |
DRNTU::Engineering::Computer science and engineering Chin, Jin Yao Online tweet summarization : a topic modelling-based approach |
description |
Twitter is an online social networking service in which users post short messages called “tweets”. Twitter users can follow other users, forming a network whereby a user receives all the tweets posted by the users that he/she follows. Similar to many other social networking services such as Facebook and Instagram, Twitter has adopted a reverse chronological timeline since its release. The reverse chronological timeline is inadequate for two main reasons: (a) the most recent posts may repeat the same information, and (b) it can be relatively difficult for users to see the overall picture of the topics being discussed in the entire collection of most recent posts.
To overcome the limitations of the reverse chronological timeline, we present an alternative approach based on topic modelling in this project. Topic modelling is a text mining technique used to identify hidden topics in a collection of text documents, and we adopted the most basic and widely used topic model, based on Latent Dirichlet Allocation (LDA), for the project. Beyond identifying the most salient topics in a collection of tweets, we also examined and proposed solutions to issues such as ranking tweets by their relevance to a topic, as well as the generation of topic labels and topic summaries.
A pilot user study involving 20 participants was conducted to evaluate the performance of the proposed solution. The findings from the user study show that the topic modelling-based approach outperforms the reverse chronological baseline in many areas, and highlight the feasibility of this approach. The user study also helped identify areas of future work that could further enhance the proposed solution. |
author2 |
Sourav Saha Bhowmick |
author_facet |
Sourav Saha Bhowmick Chin, Jin Yao |
format |
Final Year Project |
author |
Chin, Jin Yao |
author_sort |
Chin, Jin Yao |
title |
Online tweet summarization : a topic modelling-based approach |
title_short |
Online tweet summarization : a topic modelling-based approach |
title_full |
Online tweet summarization : a topic modelling-based approach |
title_fullStr |
Online tweet summarization : a topic modelling-based approach |
title_full_unstemmed |
Online tweet summarization : a topic modelling-based approach |
title_sort |
online tweet summarization : a topic modelling-based approach |
publishDate |
2016 |
url |
http://hdl.handle.net/10356/67285 |
_version_ |
1759856500355891200 |