FUSED DIALOGUE SYSTEM ON OPEN-SOURCE LARGE LANGUAGE MODEL USING END-TO-END APPROACH

In recent years, many researchers have focused on improving dialogue systems in both task-oriented dialogue (ToD) and chitchat domains. However, to create a dialogue system that truly mimics human conversation, we need to combine these objectives, as natural human interactions usually involve bot...

Bibliographic Details
Main Author: Dehan Al Kautsar, Muhammad
Format: Theses
Language: Indonesia
Online Access: https://digilib.itb.ac.id/gdl/view/84274
Institution: Institut Teknologi Bandung
id id-itb.:84274
spelling id-itb.:84274 2024-08-15T07:17:33Z FUSED DIALOGUE SYSTEM ON OPEN-SOURCE LARGE LANGUAGE MODEL USING END-TO-END APPROACH Dehan Al Kautsar, Muhammad Indonesia Theses Fused dialogue system, Open-source, Large language model, Belief span, Task-oriented Dialogue, Chitchat INSTITUT TEKNOLOGI BANDUNG https://digilib.itb.ac.id/gdl/view/84274 text
institution Institut Teknologi Bandung
building Institut Teknologi Bandung Library
continent Asia
country Indonesia
content_provider Institut Teknologi Bandung
collection Digital ITB
language Indonesia
description In recent years, many researchers have focused on improving dialogue systems in both the task-oriented dialogue (ToD) and chitchat domains. However, to create a dialogue system that truly mimics human conversation, these objectives need to be combined, as natural human interactions usually involve both engaging and informative elements. So far, there has been no significant work addressing this challenge using large language models, particularly open-source ones. This thesis offers a comprehensive exploration of the fused dialogue system task using an open-source large language model, along with an analysis of the model's performance and error patterns. Additionally, this thesis proposes a robust framework to tackle this task using just a belief span 'bpsn'. Our approach opens new possibilities and provides several insights for future research in this domain. The experiments cover three areas: (1) the proposed mdc-fuse architecture with Mistral-7B instruct and GODEL-base; (2) naive approaches to several subtasks (dialogue state tracking, response generation, and end-to-end dialogue) on two open-source LLMs, Mistral-7B instruct and Llama-3, to compare them; and (3) the compatibility between InstructTODS and smaller open-source LLMs such as Mistral-7B instruct, followed by several additional analyses of the fused dialogue system. The experimental results indicate that the choice of language model and training method in the mdc-fuse architecture significantly affects its performance. The type of open-source LLM used also plays a critical role: architectures using LLMs such as Llama-3 tend to achieve higher metric scores than those using Mistral-7B instruct. Among the cases studied, mdc-fuse with GODEL-base performed best, outperforming Mistral-7B instruct. Additionally, the experiments reveal that increasing the number of shots and limiting the dialogue context can further improve the architecture's performance.
format Theses
author Dehan Al Kautsar, Muhammad
title FUSED DIALOGUE SYSTEM ON OPEN-SOURCE LARGE LANGUAGE MODEL USING END-TO-END APPROACH
url https://digilib.itb.ac.id/gdl/view/84274
_version_ 1822998498652454912