Automatic document categorization
Main Author: | |
---|---|
Other Authors: | |
Format: | Final Year Project |
Language: | English |
Published: | 2016 |
Subjects: | |
Online Access: | http://hdl.handle.net/10356/67886 |
Institution: | Nanyang Technological University |
Summary: | With the growing popularity of social media networks in recent years, concerns have been raised about exposure to cyber bullying. Harmful content has a severe negative impact on the mental health of the people exposed to it, especially teenagers. It is therefore essential to find an effective method of cyber bullying detection.
In this paper, we propose two models for text representation and feature extraction. We first introduce the topic and review related work. We then present the two text representation models, the Embedding Enhanced Bag-of-Words model and the Bullying-Word-Filter model. In the experiments, we apply both models to a set of manually labeled tweets and report their prediction scores. In the second part, we use the classifiers trained in the first part in a case study of cyber bullying cases in Singapore.
The results show that the proposed models outperform many existing models and detect cyber bullying efficiently. Further work is left for the future. |
---|