DEVELOPMENT OF QUESTION ANSWERING SYSTEM ON AL-QUR'AN TRANSLATION USING LARGE LANGUAGE MODEL
Main Author: | |
---|---|
Format: | Final Project |
Language: | Indonesian |
Online Access: | https://digilib.itb.ac.id/gdl/view/86386 |
Institution: | Institut Teknologi Bandung |
Summary: | This research aims to develop a Question Answering (QA) system for
Al-Qur'an translation using a Large Language Model (LLM). The system is designed
to facilitate understanding of the Holy Qur'an, especially for new converts to
Islam. In Indonesia, the country with the largest Muslim population in the world,
there is a need for a system that can answer questions about the Islamic
knowledge contained in the Holy Qur'an.
The research process involved creating the IndoQRCD dataset, an Indonesian
translation of the Qur'anic Reading Comprehension Dataset (QRCD). This dataset
was used to fine-tune two pre-trained models, XLM-RoBERTa and IndoBERT. Test
results on the exact match and F1 score metrics show that IndoBERT is better at
producing the correct answer from the given context.
The QA system, named Qur'an QA, is built on a Retrieval-Augmented Generation
(RAG) architecture. A vector store serves as the knowledge base and also acts as
the retriever, using a similarity search algorithm. Qur'an QA supports two types
of QA, extractive and generative. The extractive QA generator uses the fine-tuned
IndoBERT, while GPT-4 is chosen as the generator for generative QA.
Qur'an QA accepts questions about Islam in Indonesian and provides answers based
on context given in the form of quoted Qur'anic verses. Test results on the
answer relevancy and faithfulness metrics show that generative QA is better than
extractive QA at generating relevant answers and minimizing hallucinations. |
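
The summary reports exact match and F1 scores for the fine-tuned extractive models but does not state how they were computed. The following is a minimal sketch, assuming the common SQuAD-style token-level definitions; the normalization rules and the function names `normalize`, `exact_match`, and `f1_score` are illustrative assumptions, not taken from the thesis.

```python
from collections import Counter


def normalize(text: str) -> list[str]:
    """Lowercase, replace punctuation with spaces, and split into tokens.

    Assumption: a simple normalization; the thesis may use different rules.
    """
    cleaned = "".join(
        ch if ch.isalnum() or ch.isspace() else " " for ch in text.lower()
    )
    return cleaned.split()


def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized token sequences are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = normalize(prediction)
    ref_tokens = normalize(reference)
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

Under these definitions, a predicted span that covers only part of the reference answer scores 0 on exact match but still earns partial credit on F1, which is why the two metrics are usually reported together.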
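
The retrieval step described in the summary (a vector store queried by similarity search, whose results are passed to a generator) can be sketched as follows. This is an illustration under stated assumptions only: the bag-of-words `embed` function stands in for a real embedding model, the `VectorStore` class and `build_prompt` helper are hypothetical, and the stored passages are placeholders rather than actual verse translations.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real system would use a neural encoder."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class VectorStore:
    """Hypothetical in-memory knowledge base that doubles as the retriever."""

    def __init__(self) -> None:
        self._items: list[tuple[str, str, Counter]] = []

    def add(self, doc_id: str, text: str) -> None:
        self._items.append((doc_id, text, embed(text)))

    def search(self, query: str, k: int = 2) -> list[tuple[str, str]]:
        """Return the k passages most similar to the query, best first."""
        q = embed(query)
        scored = [(cosine(q, vec), doc_id, text) for doc_id, text, vec in self._items]
        scored.sort(key=lambda item: item[0], reverse=True)
        return [(doc_id, text) for _, doc_id, text in scored[:k]]


def build_prompt(question: str, passages: list[tuple[str, str]]) -> str:
    """Assemble the context that a generator (extractive or generative) would read."""
    context = "\n".join(f"[{doc_id}] {text}" for doc_id, text in passages)
    return f"Context:\n{context}\n\nQuestion: {question}"


# Placeholder passages, not real verse translations.
store = VectorStore()
store.add("doc-a", "placeholder passage about patience and prayer")
store.add("doc-b", "placeholder passage about charity and fasting")
store.add("doc-c", "placeholder passage about travel")

top = store.search("patience and prayer", k=2)
prompt = build_prompt("patience and prayer", top)
```

In a pipeline of the kind the summary describes, the assembled context would then go either to the fine-tuned extractive model, which selects an answer span from it, or to the generative model, which writes an answer grounded in it.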