CRIME CLASSIFICATION FROM SOCIAL MEDIA X (TWITTER) DATA USING BERT
Classification of crimes in Indonesia based on data from social media, X (Twitter), is a significant challenge in detecting and preventing criminal acts in this digital age. With the increasing use of social media platforms as a means of communication, it is important to develop an effective syst...
Main Author: | Utami Amaliah. W, Nurul |
---|---|
Format: | Final Project |
Language: | Indonesian |
Online Access: | https://digilib.itb.ac.id/gdl/view/85552 |
Institution: | Institut Teknologi Bandung |
id |
id-itb.:85552 |
---|---|
spelling |
id-itb.:85552; 2024-08-21T19:26:47Z; CRIME CLASSIFICATION FROM SOCIAL MEDIA X (TWITTER) DATA USING BERT; Utami Amaliah. W, Nurul; Indonesia; Final Project; Text Classification, IndoBERT, Crime; INSTITUT TEKNOLOGI BANDUNG; https://digilib.itb.ac.id/gdl/view/85552; text |
institution |
Institut Teknologi Bandung |
building |
Institut Teknologi Bandung Library |
continent |
Asia |
country |
Indonesia |
content_provider |
Institut Teknologi Bandung |
collection |
Digital ITB |
language |
Indonesian |
description |
Classifying crimes in Indonesia from data on the social media platform X
(Twitter) is a significant challenge in detecting and preventing criminal acts in
the digital age. As social media platforms are increasingly used as a means of
communication, it is important to develop an effective system for analyzing and
classifying crime-related information to help law enforcement and society.
This research offers a solution that uses the pretrained IndoBERT model for
crime classification. The data were collected from X (Twitter) and consist of
tweets related to seven types of crime: murder, violence, rape, kidnapping, theft,
narcotics, and fraud. IndoBERT, which has been trained on a large number of
Indonesian-language texts, was adapted for this classification task, while a
Word2Vec-LSTM combination served as the baseline model.
The evaluation results showed that the IndoBERT model achieved an accuracy of
99.20% and an F1-score of 98.90%. The IndoBERT model also delivered better
precision than the Word2Vec-LSTM model, with
an improvement of 0.70%. The research is expected to contribute to the
development of a more responsive and accurate crime detection system and to
demonstrate the potential of transformer-based models for text analysis in
a local context. |
format |
Final Project |
author |
Utami Amaliah. W, Nurul |
title |
CRIME CLASSIFICATION FROM SOCIAL MEDIA X (TWITTER) DATA USING BERT |
url |
https://digilib.itb.ac.id/gdl/view/85552 |