INTERACTIVE QUESTION-ANSWERING SYSTEM USING LARGE LANGUAGE MODEL AND RETRIEVAL-AUGMENTED GENERATION IN INTELLIGENT TUTORING SYSTEM ON THE PROGRAMMING DOMAIN

Bibliographic Details
Main Author: Christian Wijaya, Owen
Format: Final Project
Language: Indonesian
Online Access: https://digilib.itb.ac.id/gdl/view/82402
Institution: Institut Teknologi Bandung
Description
Summary: One of the main weaknesses of existing online programming learning platforms is the lack of interaction between students and mentors. In this final project, an intelligent tutoring system was developed with an interactive question-answering (QA) system to enable live interaction between students and a tutor model. The interactive QA system was built with a large language model (LLM) and retrieval-augmented generation (RAG) to answer students' questions based on the learning materials. The QA pipeline was developed using the LangChain framework and can be integrated directly into the website. Document processing was carried out to convert the learning materials into embeddings stored in a vector database. The RAG mechanism was used alongside prompt engineering to steer the model toward answers grounded in the context of learning programming. The evaluation was qualitative: the results of the retrieval processes were compared, and the answers provided by the QA system were assessed qualitatively. A subjective evaluation was performed by comparing the answers of several 4-bit quantized LLMs on both single-turn and multi-turn questions. In addition to the subjective evaluation, an external evaluation was conducted through a questionnaire completed by fourteen respondents, using five questions and the corresponding answers from each model as test data. The evaluation results showed that Llama 3 produced more consistent results than the other models, and that RAG could be made more effective by using larger documents.
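The retrieval flow described in the summary (split the learning materials, embed them, store them, retrieve the closest passages, and build a tutoring prompt) can be illustrated with a minimal, dependency-free sketch. This is not the project's implementation: the project used LangChain, a real embedding model, a vector database, and a 4-bit quantized LLM, none of which appear here. The bag-of-words `embed` function, the `VectorStore` class, the sample `MATERIALS` text, the prompt wording, and the `chunk_size` value are all assumptions made for illustration only.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: a term-count vector (assumed stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def split_into_chunks(text, chunk_size):
    """Split text into consecutive chunks of at most chunk_size words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


class VectorStore:
    """Hypothetical in-memory substitute for a vector database."""

    def __init__(self):
        self.entries = []  # list of (chunk text, embedding) pairs

    def add(self, chunk):
        self.entries.append((chunk, embed(chunk)))

    def search(self, query, k=2):
        """Return the k chunks most similar to the query."""
        q = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question, context_chunks):
    """Assemble a tutoring prompt; the wording is an assumed example of prompt engineering."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return (
        "You are a programming tutor. Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\nAnswer:"
    )


# Assumed sample learning material; the real system indexed course documents.
MATERIALS = (
    "A for loop repeats a block of code for each item in a sequence. "
    "A function groups statements under a name and is defined with the def keyword. "
    "A list stores an ordered collection of values and supports indexing."
)

store = VectorStore()
for chunk in split_into_chunks(MATERIALS, chunk_size=14):
    store.add(chunk)

question = "How is a function defined with def?"
top_chunks = store.search(question, k=1)
prompt = build_prompt(question, top_chunks)
print(prompt)  # in a full pipeline, this prompt would be sent to the LLM
```

In this sketch, `chunk_size` controls how much text each retrieved passage carries into the prompt, which is one way to experiment with the summary's observation that retrieval may work better with larger documents.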