DEVELOPMENT OF QUESTION ANSWERING SYSTEM ON AL-QUR'AN TRANSLATION USING LARGE LANGUAGE MODEL

This research aims to develop a Question Answering (QA) system on Al-Qur'an translation using a Large Language Model (LLM). The system is designed to facilitate understanding of the Holy Qur'an, especially for new converts to Islam. In the context of Indonesia, as a country with the...

Bibliographic Details
Main Author: Restu Maulana, Diky
Format: Final Project
Language: Indonesia
Online Access: https://digilib.itb.ac.id/gdl/view/86386
Institution: Institut Teknologi Bandung
id id-itb.:86386
spelling id-itb.:86386 2024-09-18T08:04:29Z
keywords Question Answering System, Al-Qur'an, Large Language Model, Retrieval-Augmented Generation, IndoBERT, GPT, IndoQRCD, Qur'an QA
institution Institut Teknologi Bandung
building Institut Teknologi Bandung Library
continent Asia
country Indonesia
content_provider Institut Teknologi Bandung
collection Digital ITB
language Indonesia
description This research aims to develop a Question Answering (QA) system on Al-Qur'an translation using a Large Language Model (LLM). The system is designed to facilitate understanding of the Holy Qur'an, especially for new converts to Islam. Indonesia is the country with the largest Muslim population in the world, so there is a need for a system that can answer questions about the Islamic knowledge contained in the Holy Qur'an. The research process involved creating the IndoQRCD dataset, a translation of the Qur'anic Reading Comprehension Dataset (QRCD) into Indonesian. This dataset was used to fine-tune two pre-trained models, XLM-RoBERTa and IndoBERT. Test results on the exact match and F1 score metrics show that IndoBERT is better at producing the right answer from the given context. The QA system is built on a Retrieval-Augmented Generation (RAG) architecture and is named Qur'an QA. A vector store serves as the knowledge base and also acts as the retriever, searching with a similarity search algorithm. Qur'an QA can perform two types of QA: extractive and generative. The extractive QA generator uses the fine-tuned IndoBERT, while GPT-4 was chosen as the generator for generative QA. Qur'an QA accepts questions about Islam in Indonesian as input and provides answers based on the retrieved context, given in the form of Qur'anic verse quotations. Test results on the answer relevancy and faithfulness metrics show that generative QA is better than extractive QA at generating relevant answers and minimizing hallucinations.
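The abstract reports exact match and F1 scores for the extractive models but does not state how they were computed. As a minimal sketch, assuming the common SQuAD-style token-level definitions (an assumption, not confirmed by this record), the two metrics could be computed as follows. The function names, the normalization rule, and the example strings are hypothetical illustrations.

```python
import re
from collections import Counter


def normalize(text: str) -> list[str]:
    """Lowercase, replace punctuation with spaces, and split on whitespace.

    Assumption: a simple normalization; the thesis may use a different one.
    """
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return text.split()


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized token sequences are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Token-level F1: harmonic mean of precision and recall over shared tokens."""
    pred, ref = normalize(prediction), normalize(reference)
    if not pred or not ref:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred == ref)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical predicted and reference answer spans in Indonesian.
    print(exact_match("Orang-orang yang beriman.", "orang orang yang beriman"))
    print(f1_score("orang yang beriman dan bertakwa", "orang yang beriman"))
```

Exact match rewards only a fully correct span, while F1 gives partial credit when the predicted span overlaps the reference, which is why the two are usually reported together for extractive QA.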
format Final Project
author Restu Maulana, Diky
title DEVELOPMENT OF QUESTION ANSWERING SYSTEM ON AL-QUR'AN TRANSLATION USING LARGE LANGUAGE MODEL
title_sort development of question answering system on al-qur'an translation using large language model
url https://digilib.itb.ac.id/gdl/view/86386
_version_ 1822999528170586112