DEVELOPING COVID-19 INFORMATION VALIDATION SYSTEM USING NATURAL LANGUAGE INFERENCE WITH DEEP LEARNING

Bibliographic Details
Main Author: Gde Aditya Taguh Widiana, Putu
Format: Final Project
Language: Indonesia
Online Access: https://digilib.itb.ac.id/gdl/view/55856
Institution: Institut Teknologi Bandung
id id-itb.:55856
spelling id-itb.:55856 | 2021-06-19T18:08:56Z | DEVELOPING COVID-19 INFORMATION VALIDATION SYSTEM USING NATURAL LANGUAGE INFERENCE WITH DEEP LEARNING | Gde Aditya Taguh Widiana, Putu | Indonesia | Final Project | information validation, Natural Language Inference, BERT, RoBERTa | INSTITUT TEKNOLOGI BANDUNG | https://digilib.itb.ac.id/gdl/view/55856 | text
building Institut Teknologi Bandung Library
continent Asia
country Indonesia
content_provider Institut Teknologi Bandung
collection Digital ITB
description During the COVID-19 pandemic, a large volume of information about the disease has spread without any means of validating it. This final project therefore develops a COVID-19 information validation system that checks information against facts from scientific articles. The project builds the system using the Natural Language Inference (NLI) task with deep learning, and covers the system architecture, the selection and construction of training data, and the development of a deep learning model for NLI. The system consists of two main modules: the fact-finding module and the sentence comparison module. The fact-finding module retrieves relevant facts from scientific articles in the CORD-19 dataset, which are stored in Elasticsearch. The sentence comparison module (inference module) compares the relevant facts with the information to be validated and classifies their relationship as entailment, neutral, or contradiction. This module implements the NLI task using the RoBERTa-large model with an added classifier layer. In the NLI setup, the facts from the fact-finding module serve as premises, and the information to be validated serves as the hypothesis. The project focuses on developing the deep learning model for the NLI task. Based on the experimental results, the RoBERTa-large model trained on the combined SNLI and MultiNLI datasets was selected. The model achieves an accuracy of 0.9322 on SNLI dev, 0.9253 on SNLI test, 0.9058 on MultiNLI dev matched, and 0.9052 on MultiNLI mismatched. It also performs well on the "stress test for NLI", with an accuracy of 0.7466 on the competence test and 0.8776 on the noise test. The model is weak against distraction in the hypothesis. The model serves as the inference module and, together with the fact-finding module, forms the COVID-19 information validation system.
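The two-module flow described above (retrieve facts, then compare each fact with the information to be validated) can be illustrated with a minimal Python sketch. It is not the project's implementation: `retrieve_facts` is a hypothetical word-overlap stand-in for the Elasticsearch search over CORD-19, `toy_classifier` is a hypothetical rule-based stand-in for the fine-tuned RoBERTa-large model, and all names, the `top_k` parameter, and the sample sentences are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

NEGATIONS = {"not", "no", "never"}


@dataclass
class Verdict:
    premise: str      # fact returned by the fact-finding module
    hypothesis: str   # information to be validated
    label: str        # "entailment", "neutral", or "contradiction"


def retrieve_facts(claim: str, corpus: Sequence[str], top_k: int = 3) -> List[str]:
    """Hypothetical fact-finding stand-in: rank sentences by word overlap.

    The described system queries Elasticsearch instead; this toy ranking
    only keeps the sketch self-contained.
    """
    claim_words = set(claim.lower().split())
    scored = [(len(claim_words & set(s.lower().split())), s) for s in corpus]
    scored = [(score, s) for score, s in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # stable sort
    return [s for _, s in scored[:top_k]]


def toy_classifier(premise: str, hypothesis: str) -> str:
    """Hypothetical placeholder for the NLI model (assumption, not RoBERTa)."""
    p = set(premise.lower().split())
    h = set(hypothesis.lower().split())
    # Hypothesis mentions something the premise does not cover.
    if not ((h - NEGATIONS) <= (p - NEGATIONS)):
        return "neutral"
    # Same content words, but only one side is negated.
    if bool(p & NEGATIONS) != bool(h & NEGATIONS):
        return "contradiction"
    return "entailment"


def validate(
    claim: str,
    corpus: Sequence[str],
    classify: Callable[[str, str], str],
    top_k: int = 3,
) -> List[Verdict]:
    """Use each retrieved fact as a premise and the claim as the hypothesis."""
    facts = retrieve_facts(claim, corpus, top_k)
    return [Verdict(fact, claim, classify(fact, claim)) for fact in facts]


# Made-up sample sentences, not taken from CORD-19.
corpus = [
    "masks reduce transmission of the virus",
    "the virus does not spread through food",
    "vaccines were tested in trials",
]

for verdict in validate("the virus does spread through food", corpus, toy_classifier):
    print(verdict.label, "<-", verdict.premise)
```

Passing the classifier in as a function keeps the pipeline independent of the model, so a real NLI model could replace `toy_classifier` without changing `validate`.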