CRIME CLASSIFICATION FROM SOCIAL MEDIA X (TWITTER) DATA USING BERT

Bibliographic Details
Main Author: Utami Amaliah. W, Nurul
Format: Final Project
Language: Indonesian
Online Access: https://digilib.itb.ac.id/gdl/view/85552
Institution: Institut Teknologi Bandung
Description
Summary: Classifying crimes in Indonesia from data on the social media platform X (Twitter) is a significant challenge in detecting and preventing criminal acts in the digital age. As social media platforms are increasingly used for communication, an effective system for analyzing and classifying crime-related information is needed to support law enforcement and society. This research addresses the problem by using the pretrained IndoBERT model for crime classification. The data, collected from X (Twitter), consists of tweets related to seven types of crime: murder, violence, rape, kidnapping, theft, narcotics, and fraud. IndoBERT, which was pretrained on a large corpus of Indonesian-language text, was adapted to this classification task, while a Word2Vec-LSTM combination served as the baseline model. The evaluation showed that the IndoBERT model achieved an accuracy of 99.20% and an F1-score of 98.90%. IndoBERT also outperformed the Word2Vec-LSTM model in precision, with an improvement of 0.70%. The research is expected to contribute to the development of more responsive and accurate crime detection systems, and to demonstrate the potential of transformer-based models for text analysis in a local context.
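The accuracy, precision, and F1-score reported in the summary are standard multi-class metrics. The following is a minimal sketch of how such scores could be computed over the seven crime categories. It assumes macro averaging (the summary does not say which averaging was used), the function name `evaluate` is hypothetical, and the labels are toy values invented for illustration, not data or results from the study.

```python
from collections import Counter

# The seven crime categories named in the summary (English glosses).
LABELS = ["murder", "violence", "rape", "kidnapping", "theft", "narcotics", "fraud"]


def evaluate(y_true, y_pred, labels=LABELS):
    """Return accuracy plus macro-averaged precision, recall, and F1.

    Macro averaging is an assumption; the summary does not specify it.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # class p was predicted, but incorrectly
            fn[t] += 1  # true class t was missed

    def ratio(num, den):
        # Guard against classes with no predictions or no true examples.
        return num / den if den else 0.0

    precisions, recalls, f1s = [], [], []
    for label in labels:
        prec = ratio(tp[label], tp[label] + fp[label])
        rec = ratio(tp[label], tp[label] + fn[label])
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(ratio(2 * prec * rec, prec + rec))

    n = len(labels)
    return {
        "accuracy": ratio(sum(tp.values()), len(y_true)),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy labels, invented purely for illustration (not data from the study).
y_true = ["murder", "theft", "fraud", "theft", "narcotics", "rape", "violence", "kidnapping"]
y_pred = ["murder", "theft", "fraud", "fraud", "narcotics", "rape", "violence", "kidnapping"]

scores = evaluate(y_true, y_pred)
print({name: round(value, 4) for name, value in scores.items()})
```

In this toy example a single "theft" tweet misclassified as "fraud" lowers the recall for theft and the precision for fraud, which is why accuracy and macro F1 can differ even on the same predictions.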