Mining social media data
Main Author: | |
---|---|
Other Authors: | |
Format: | Final Year Project |
Language: | English |
Published: | 2016 |
Subjects: | |
Online Access: | http://hdl.handle.net/10356/66709 |
Institution: | Nanyang Technological University |
Summary: | In recent years, there has been huge growth in the use of social media. Despite
the huge amount of social media data available, it is still not fully utilised. Hence,
there is a need for social media mining to find patterns and make sense of the data
available. This study sought to predict popular topics by examining them on Twitter
over a time window of 7 days. Three classification algorithms, namely Decision Tree
Classifiers, Naïve Bayes Classifiers and Support Vector Machines, were applied, and
their performance was compared to find the most effective algorithm for mining two
different types of class labels, Absolute and Relative Addressing. The results showed
that Support Vector Machines produced more accurate results but took a substantial
amount of time to process. Decision Tree Classifiers, on the other hand, took a much
shorter time to process but were still able to predict with only slightly lower accuracy
than Support Vector Machines. Therefore, mining Twitter data proves to be useful in
predicting popular topics, and mining social media data can be an effective method
for commercial purposes. While this study focuses on only three classification
algorithms and one data set with two types of class labels, further studies on other
social media, algorithms and data sets could provide more accurate and
comprehensive findings. |
---|