VALIDATION OF COVID-19 INFORMATION IN INDONESIAN LANGUAGE USING A NATURAL LANGUAGE INFERENCE AND KNOWLEDGE GRAPH-BASED APPROACH
Main Author: | |
---|---|
Format: | Theses |
Language: | Indonesian |
Online Access: | https://digilib.itb.ac.id/gdl/view/81541 |
Institution: | Institut Teknologi Bandung |
Summary: | COVID-19 has emerged as a global public health concern due to its rapid spread
and high mortality rate. Effective community-level control of this disease hinges on
the availability of accurate information. However, the massive spread of
misinformation through the internet makes individuals susceptible to false
information. Automated fact-checking systems can play a crucial role in assisting
with the validation of information veracity in such cases. These systems typically
involve a Natural Language Inference (NLI) approach to identify the semantic
relationship between a proven-true premise sentence and a hypothesis sentence
representing the claim to be verified. This semantic relationship can be entailment,
contradiction, or neutral. An information item is considered true if an entailment
relationship exists between the sentences.
This research introduces the use of Knowledge Graphs (KGs) to enhance the
performance of NLI in validating information veracity. KGs serve to augment
factual information, enabling the model to draw conclusions about the truthfulness
of information based on factual evidence. In the proposed model architecture,
information from the KG is processed in a separate module and combined with the
semantic relationship information between the premise and hypothesis sentences
processed by the NLI module. Subsequently, the information is processed by a final
module to draw conclusions.
In the implementation of the proposed model architecture, NLI and KG inputs are
processed separately, and then the respective representative vectors from these
inputs are combined to form a final vector. The final vector is then used as input to
the classifier to produce the final result. The resulting model is trained using the
Indonesian COVID-19 NLI dataset and the Indonesian COVID-19 KG.
The research findings demonstrate that using KGs within the proposed model
architecture can improve NLI performance in validating information veracity. The
highest accuracy achieved was 0.8616 in validating COVID-19
information in Indonesian. KGs provide additional information that can
strengthen the validation of information veracity, rather than solely relying on the
semantic relationships formed. Therefore, KGs can serve as a crucial component
in an automated fact-checking system to validate information veracity based on
factual evidence. |
---|
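
The fusion step described in the summary (NLI and KG inputs processed separately, their representative vectors combined into a final vector, and that vector passed to a classifier) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the record does not say how the vectors are combined (concatenation is assumed), which encoders produce them (random placeholder vectors are used), or what form the classifier takes (a single linear layer with softmax is assumed). The names `fuse`, `classify`, and `is_valid` are hypothetical.

```python
import math
import random

# The three semantic relationships named in the summary.
LABELS = ["entailment", "contradiction", "neutral"]


def fuse(nli_vec, kg_vec):
    """Combine the NLI and KG vectors into one final vector.

    Assumption: plain concatenation; the record does not specify the method.
    """
    return list(nli_vec) + list(kg_vec)


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(final_vec, weights, bias):
    """Assumed classifier: one linear layer followed by softmax."""
    logits = [
        sum(w * x for w, x in zip(row, final_vec)) + b
        for row, b in zip(weights, bias)
    ]
    probs = softmax(logits)
    best = max(range(len(LABELS)), key=probs.__getitem__)
    return LABELS[best], probs


def is_valid(label):
    """An information item is considered true only under entailment."""
    return label == "entailment"


# Placeholder vectors standing in for the outputs of the NLI and KG modules.
rng = random.Random(0)
nli_vec = [rng.uniform(-1, 1) for _ in range(4)]
kg_vec = [rng.uniform(-1, 1) for _ in range(3)]

final_vec = fuse(nli_vec, kg_vec)

# Untrained, randomly initialised parameters (illustration only).
weights = [[rng.uniform(-1, 1) for _ in final_vec] for _ in LABELS]
bias = [0.0] * len(LABELS)

label, probs = classify(final_vec, weights, bias)
print(label, is_valid(label))
```

In a trained system the placeholder vectors would come from the NLI and KG modules, and the classifier parameters would be learned from the Indonesian COVID-19 NLI dataset and KG mentioned in the summary.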