Towards explainable and semantically coherent claim extraction for an automated fact-checker

Bibliographic Details
Main Author:	Yoswara, Jocelyn Valencia
Other Authors:	Erry Gunawan
Format:	Final Year Project
Language:	English
Published:	Nanyang Technological University 2024
Subjects:	Computer and Information Science; Engineering; Natural language processing; Large language models; Claim extraction; Automatic fact checker; Machine learning
Online Access:	https://hdl.handle.net/10356/176481
Institution:	Nanyang Technological University

Description
Summary:	Misinformation and fake news spread widely through online social media platforms. Automatic fact-checkers and LLMs such as ChatGPT have become popular and appear to be a promising way to detect fake news, but these models are limited by their reliance on pre-existing knowledge, and it remains unclear whether they can reliably distinguish truth from falsehood. Given these concerns, claim extraction, the process of obtaining claims from various sources, becomes one of the most crucial steps in an automatic fact-checker. This project therefore aims to enhance the claim extraction process of an automatic fact-checker, improve the accuracy of fact-checking, and mitigate the limitations associated with LLMs' reliance on outdated information. First, a baseline model was created using SBERT with an evidence retrieval system. Second, an evidence retrieval system was implemented using the Google Search API to gather evidence related to claims from online sources. Finally, the GPT-4 model was used to verify claims against the available evidence. The GPT-4 model outperforms the baseline model, achieving 94% accuracy in claim verification. However, the baseline model provides more comprehensive insight into the coverage of the entire dataset, and the evidence retrieval process was found to significantly affect both the accuracy and the coverage of claim verification. Overall, this project enhances the claim extraction process while also identifying limitations and suggesting areas for further improvement in claim extraction for an automatic fact-checker.
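
The evidence-matching step of the baseline described in the summary can be illustrated with a minimal sketch. It is not the project's code: the record does not describe the implementation, so every name below (embed, cosine, rank_evidence, has_enough_evidence, the 0.3 threshold) is hypothetical, and a simple word-count vector stands in for SBERT sentence embeddings so that the example runs with the standard library alone. The snippets would, in the pipeline described, come from a web search; here they are hard-coded.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for a sentence embedding: lowercase word counts.
    (Assumption: the real baseline would use SBERT vectors instead.)"""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank_evidence(claim: str, snippets: list[str], top_k: int = 2):
    """Return the top_k (score, snippet) pairs most similar to the claim."""
    claim_vec = embed(claim)
    scored = [(cosine(claim_vec, embed(s)), s) for s in snippets]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


def has_enough_evidence(ranked, threshold: float = 0.3) -> bool:
    """Hypothetical coverage check: is the best snippet similar enough
    to the claim to attempt verification at all?"""
    return bool(ranked) and ranked[0][0] >= threshold


if __name__ == "__main__":
    claim = "The Eiffel Tower is in Paris"
    snippets = [
        "Bananas are rich in potassium.",
        "The Eiffel Tower is a landmark in Paris, France.",
    ]
    ranked = rank_evidence(claim, snippets)
    for score, snippet in ranked:
        print(f"{score:.2f}  {snippet}")
    print("enough evidence:", has_enough_evidence(ranked))
```

A threshold check of this kind is one way the retrieval step could limit coverage: a claim whose best-matching snippet scores below the threshold would be left unverified rather than judged.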

Similar Items