CRIME CLASSIFICATION FROM SOCIAL MEDIA X (TWITTER) DATA USING BERT
Main Author: | |
---|---|
Format: | Final Project |
Language: | Indonesian |
Online Access: | https://digilib.itb.ac.id/gdl/view/85552 |
Institution: | Institut Teknologi Bandung |
Summary: | Classifying crimes in Indonesia from data on the social media platform X
(Twitter) is a significant challenge in detecting and preventing criminal acts in
the digital age. As social media platforms are increasingly used as a means of
communication, an effective system for analyzing and classifying
crime-related information is needed to support law enforcement and society.
This research offers a solution by using the pretrained IndoBERT model for
crime classification. The data, collected from X (Twitter), consist of tweets
related to seven types of crime: murder, violence, rape, kidnapping, theft,
narcotics, and fraud. IndoBERT, which has been pretrained on a large corpus of
Indonesian-language text, was adapted for this classification task, while a
combined Word2Vec-LSTM model served as the baseline.
The evaluation results showed that the IndoBERT model achieved an accuracy of
99.20% and an F1-score of 98.90%. The IndoBERT model also outperformed the
Word2Vec-LSTM model in precision, with an improvement of 0.70%. The research is
expected to contribute to the development of a more responsive and accurate
crime detection system, and to demonstrate the strong potential of
transformer-based models for text analysis in a local context. |
---|
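
The summary reports accuracy, precision, and F1-score over seven crime categories. As a rough illustration of how such figures can be computed from predicted and true labels, here is a minimal sketch in plain Python. The record does not say how the per-class scores were averaged, so the macro-averaging below is an assumption, and the `evaluate` function and `LABELS` list are hypothetical names introduced only for this example.

```python
from collections import Counter

# The seven crime categories named in the summary.
LABELS = ["murder", "violence", "rape", "kidnapping", "theft", "narcotics", "fraud"]


def evaluate(y_true, y_pred, labels=LABELS):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Macro-averaging is an assumption; the source does not state
    which averaging scheme was used.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    # Per-class true positives, false positives, and false negatives.
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        # A class with no predictions (or no true examples) scores 0.0.
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }
```

Calling `evaluate(true_labels, predicted_labels)` once for each model's predictions on the same test set would give directly comparable scores, which is the kind of comparison the summary draws between IndoBERT and the Word2Vec-LSTM baseline.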