VALIDATION OF COVID-19 INFORMATION IN INDONESIAN LANGUAGE USING A NATURAL LANGUAGE INFERENCE AND KNOWLEDGE GRAPH-BASED APPROACH

Bibliographic Details
Main Author:	Purnama Muharram, Arief
Format:	Theses
Language:	Indonesian
Online Access:	https://digilib.itb.ac.id/gdl/view/81541
Institution:	Institut Teknologi Bandung

Description
Summary:	COVID-19 has emerged as a global public health concern due to its rapid spread and high mortality rate. Effective community-level control of this disease hinges on the availability of accurate information. However, the massive spread of misinformation through the internet makes individuals susceptible to false information. Automated fact-checking systems can play a crucial role in assisting with the validation of information veracity in such cases. These systems typically involve a Natural Language Inference (NLI) approach to identify the semantic relationship between a proven-true premise sentence and a hypothesis sentence representing the claim to be verified. This semantic relationship can be entailment, contradiction, or neutral. An information item is considered true if an entailment relationship exists between the sentences. This research introduces the use of Knowledge Graphs (KGs) to enhance the performance of NLI in validating information veracity. KGs serve to augment factual information, enabling the model to draw conclusions about the truthfulness of information based on factual evidence. In the proposed model architecture, information from the KG is processed in a separate module and combined with the semantic relationship information between the premise and hypothesis sentences processed by the NLI module. Subsequently, the information is processed by a final module to draw conclusions. In the implementation of the proposed model architecture, NLI and KG inputs are processed separately, and then the respective representative vectors from these inputs are combined to form a final vector. The final vector is then used as input to the classifier to produce the final result. The resulting model is trained using the Indonesian COVID-19 NLI dataset and the Indonesian COVID-19 KG. The research findings demonstrate that using KGs within the proposed model architecture can improve NLI performance in validating information veracity. 
The highest accuracy achieved is 0.8616 in validating COVID-19 information in Indonesian. KGs provide additional information that can strengthen the validation of information veracity beyond relying solely on the semantic relationship between the sentences. Therefore, KGs can serve as a crucial component in an automated fact-checking system that validates information veracity based on factual evidence.
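
The fusion step described in the summary (separate NLI and KG vectors combined into a final vector, which is then passed to a classifier) can be illustrated with a minimal Python sketch. The summary does not state how the vectors are combined or what the classifier looks like, so the concatenation, the linear softmax classifier, the three output labels, the vector sizes, and every name below (`concat_fusion`, `classify`, `LABELS`) are assumptions made for illustration only, not the thesis's implementation.

```python
import math
import random

# Assumed output classes, taken from the three NLI relations named in the summary.
LABELS = ("entailment", "contradiction", "neutral")


def concat_fusion(nli_vec, kg_vec):
    """Combine the two representative vectors by concatenation (an assumption)."""
    return list(nli_vec) + list(kg_vec)


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(final_vec, weights, bias):
    """Hypothetical linear classifier: one weight row and one bias per label."""
    logits = [
        sum(w * x for w, x in zip(row, final_vec)) + b
        for row, b in zip(weights, bias)
    ]
    probs = softmax(logits)
    best = max(range(len(LABELS)), key=probs.__getitem__)
    return LABELS[best], probs


# Toy stand-ins for the outputs of the NLI module and the KG module.
rng = random.Random(0)
nli_vec = [rng.uniform(-1, 1) for _ in range(4)]
kg_vec = [rng.uniform(-1, 1) for _ in range(3)]

final_vec = concat_fusion(nli_vec, kg_vec)

# Untrained random weights; a real model would learn these from the dataset.
weights = [[rng.uniform(-1, 1) for _ in final_vec] for _ in LABELS]
bias = [0.0] * len(LABELS)

label, probs = classify(final_vec, weights, bias)
print(label, [round(p, 3) for p in probs])
```

Under the rule stated in the summary, a claim would be treated as true only when the predicted relation is entailment.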

Similar Items