INFORMATION EXTRACTION OF TRAFFIC CONDITION FROM SOCIAL MEDIA USING BIDIRECTIONAL LSTM-CNN
Social media, especially Twitter, has become the most popular source of real-time information on traffic conditions. Information posted on Twitter is generally used only for short-term purposes, such as finding congestion points while an event is under way. If t...
Main Author: | RIZA ALIFI - NIM: 23515021 , M. |
---|---|
Format: | Theses |
Language: | Indonesia |
Online Access: | https://digilib.itb.ac.id/gdl/view/29049 |
Institution: | Institut Teknologi Bandung |
id |
id-itb.:29049 |
---|---|
spelling |
id-itb.:29049 2018-10-01T10:09:05Z INFORMATION EXTRACTION OF TRAFFIC CONDITION FROM SOCIAL MEDIA USING BIDIRECTIONAL LSTM-CNN RIZA ALIFI - NIM: 23515021 , M. Indonesia Theses INSTITUT TEKNOLOGI BANDUNG https://digilib.itb.ac.id/gdl/view/29049 text |
institution |
Institut Teknologi Bandung |
building |
Institut Teknologi Bandung Library |
continent |
Asia |
country |
Indonesia |
content_provider |
Institut Teknologi Bandung |
collection |
Digital ITB |
language |
Indonesia |
description |
Social media, especially Twitter, has become the most popular source of real-time information on traffic conditions. Information posted on Twitter is generally used only for short-term purposes, such as finding congestion points while an event is under way. If this information is collected and processed further, it becomes more useful for long-term needs, such as mapping the points prone to congestion at certain hours. City stakeholders need this information.

Information extraction is needed to turn unstructured social media text into structured data. Named Entity Recognition (NER) techniques can be applied to obtain the entities that represent traffic conditions. This study classifies tokens into 13 classes: B-TIME, I-TIME, B-LOCT, I-LOCT, B-COND, I-COND, B-CAUS, I-CAUS, B-WEAT, I-WEAT, B-MISC, I-MISC, and O. These classes cover time, location, condition, cause, weather, and miscellaneous entities, plus O for all other tokens, following the BIO coding scheme.

Several previous studies address the extraction of traffic-condition information from social media, but most rely on rule-based approaches. This research proposes a model architecture based on deep learning: a Bidirectional LSTM handles the word level and a CNN handles the character level. Combined with word embeddings, the two deep learning methods achieve an F-measure of 0.789. The data used in this study comprise 44,102 tweets, split into 70% training data, 15% test data, and 15% development data. |
format |
Theses |
author |
RIZA ALIFI - NIM: 23515021 , M. |
title |
INFORMATION EXTRACTION OF TRAFFIC CONDITION FROM SOCIAL MEDIA USING BIDIRECTIONAL LSTM-CNN |
url |
https://digilib.itb.ac.id/gdl/view/29049 |
_version_ |
1821995261944135680 |
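
The BIO coding scheme named in the description can be illustrated with a short Python sketch. It groups per-token tags into entity spans, using the six entity types listed in the abstract. The function name `bio_to_spans`, the example sentence, and its tags are hypothetical and are not taken from the thesis or its data; the thesis model itself (Bidirectional LSTM with a character-level CNN) is not reproduced here.

```python
from typing import List, Tuple

# Entity types listed in the abstract (each appears with a B- and an I- prefix).
ENTITY_TYPES = {"TIME", "LOCT", "COND", "CAUS", "WEAT", "MISC"}


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity_type, text) spans.

    Hypothetical helper for illustration only.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    spans: List[Tuple[str, str]] = []
    current_type = None
    current_tokens: List[str] = []

    def flush() -> None:
        # Close the entity currently being built, if any.
        nonlocal current_type, current_tokens
        if current_type is not None:
            spans.append((current_type, " ".join(current_tokens)))
        current_type, current_tokens = None, []

    for token, tag in zip(tokens, tags):
        if tag == "O":
            flush()
            continue
        prefix, _, etype = tag.partition("-")
        if prefix not in ("B", "I") or etype not in ENTITY_TYPES:
            raise ValueError(f"unknown tag: {tag}")
        # A B- tag always starts a new entity; an I- tag that does not
        # continue the current type is treated as a new start as well.
        if prefix == "B" or etype != current_type:
            flush()
            current_type = etype
        current_tokens.append(token)

    flush()
    return spans


# Hypothetical example sentence and tags (not from the thesis data set).
tokens = ["Heavy", "rain", "causes", "congestion", "on", "Jalan", "Dago", "at", "17.00"]
tags = ["B-WEAT", "I-WEAT", "O", "B-COND", "O", "B-LOCT", "I-LOCT", "O", "B-TIME"]
print(bio_to_spans(tokens, tags))
# [('WEAT', 'Heavy rain'), ('COND', 'congestion'), ('LOCT', 'Jalan Dago'), ('TIME', '17.00')]
```

Treating a stray I- tag as the start of a new entity is one lenient design choice among several; a stricter decoder could reject such sequences instead.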