INFORMATION EXTRACTION OF TRAFFIC CONDITION FROM SOCIAL MEDIA USING BIDIRECTIONAL LSTM-CNN

Today, social media, especially Twitter, has become the most popular source of real-time information on traffic conditions. Information posted on Twitter is generally used only for short-term purposes, to find congestion points during an event. If t...

Bibliographic Details
Main Author: RIZA ALIFI - NIM: 23515021 , M.
Format: Theses
Language: Indonesia
Online Access: https://digilib.itb.ac.id/gdl/view/29049
Institution: Institut Teknologi Bandung
id id-itb.:29049
spelling id-itb.:29049 2018-10-01T10:09:05Z
institution Institut Teknologi Bandung
building Institut Teknologi Bandung Library
continent Asia
country Indonesia
content_provider Institut Teknologi Bandung
collection Digital ITB
language Indonesia
description Today, social media, especially Twitter, has become the most popular source of real-time information on traffic conditions. Information posted on Twitter is generally used only for short-term purposes, to find congestion points during an event. If this information is collected and processed further, it becomes useful for long-term needs, such as mapping the points prone to congestion at certain hours. City stakeholders need this information.

Information extraction is needed to turn unstructured text from social media into structured data. Named Entity Recognition (NER) techniques can be applied to obtain the entities that represent traffic conditions. This study classifies entities into 11 classes, namely: B-TIME, I-TIME, B-LOCT, I-LOCT, B-COND, I-COND, B-CAUS, I-CAUS, B-WEAT, I-WEAT, B-MISC, I-MISC, and O. The classes represent entities of time, location, condition, cause, weather, miscellaneous, and other, encoded with the BIO scheme.

Several previous studies address information extraction of traffic conditions from social media. However, most of them rely on rule-based approaches. This research proposes a model architecture based on deep learning: a Bidirectional LSTM handles the word level, and a CNN handles the character level. Combined with word embeddings, the two deep learning methods achieve an F-measure of 0.789. The data used in this study were 44,102 tweets, split into 70% training data, 15% test data, and 15% development data.
format Theses
author RIZA ALIFI - NIM: 23515021 , M.
title INFORMATION EXTRACTION OF TRAFFIC CONDITION FROM SOCIAL MEDIA USING BIDIRECTIONAL LSTM-CNN
url https://digilib.itb.ac.id/gdl/view/29049
_version_ 1821995261944135680
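
The BIO coding scheme named in the description can be illustrated with a minimal Python sketch. It builds the label names from the entity types listed in the abstract and groups a tagged token sequence into entity spans. The function name `decode_bio`, the example tweet, and its tags are hypothetical and chosen only for illustration; they are not taken from the thesis data, and this is not the author's implementation.

```python
# Entity types named in the abstract; "O" marks tokens outside any entity.
ENTITY_TYPES = ["TIME", "LOCT", "COND", "CAUS", "WEAT", "MISC"]
LABELS = [f"{prefix}-{etype}" for etype in ENTITY_TYPES for prefix in ("B", "I")] + ["O"]


def decode_bio(tokens, tags):
    """Group BIO-tagged tokens into (entity_type, text) spans.

    Hypothetical helper: a B- tag opens a span, an I- tag of the same
    type extends it, and "O" closes it. A stray I- tag whose type does
    not match the open span is treated leniently as opening a new span.
    """
    spans = []
    current_type, current_tokens = None, []

    def flush():
        # Emit the span collected so far, if any.
        if current_tokens:
            spans.append((current_type, " ".join(current_tokens)))

    for token, tag in zip(tokens, tags):
        prefix, _, etype = tag.partition("-")
        if prefix == "B" or (prefix == "I" and etype != current_type):
            flush()
            current_type, current_tokens = etype, [token]
        elif prefix == "I":
            current_tokens.append(token)
        else:  # "O": outside any entity
            flush()
            current_type, current_tokens = None, []
    flush()
    return spans


# Hypothetical Indonesian tweet, tokenized and tagged by hand for illustration.
example_tokens = ["07.30", "Jalan", "Dago", "macet", "karena", "hujan", "deras"]
example_tags = ["B-TIME", "B-LOCT", "I-LOCT", "B-COND", "O", "B-WEAT", "I-WEAT"]

for entity_type, text in decode_bio(example_tokens, example_tags):
    print(entity_type, text)
```

A tagger such as the Bidirectional LSTM-CNN described above would produce the tag sequence; a decoding step like this one turns those per-token tags into the structured records (time, location, condition, and so on) that the abstract aims for.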