FUSED DIALOGUE SYSTEM ON OPEN-SOURCE LARGE LANGUAGE MODEL USING END-TO-END APPROACH
In recent years, many researchers have focused on improving dialogue systems in both task-oriented dialogue (ToD) and chitchat domains. However, to create a dialogue system that truly mimics human conversation, we need to combine these objectives, as natural human interactions usually involve bot...
Main Author: | Dehan Al Kautsar, Muhammad |
---|---|
Format: | Theses |
Language: | Indonesia |
Online Access: | https://digilib.itb.ac.id/gdl/view/84274 |
Institution: | Institut Teknologi Bandung |
id |
id-itb.:84274 |
---|---|
spelling |
id-itb.:84274; 2024-08-15T07:17:33Z; FUSED DIALOGUE SYSTEM ON OPEN-SOURCE LARGE LANGUAGE MODEL USING END-TO-END APPROACH; Dehan Al Kautsar, Muhammad; Indonesia; Theses; Fused dialogue system, Open-source, Large language model, Belief span, Task-oriented Dialogue, Chitchat; INSTITUT TEKNOLOGI BANDUNG; https://digilib.itb.ac.id/gdl/view/84274; text |
institution |
Institut Teknologi Bandung |
building |
Institut Teknologi Bandung Library |
continent |
Asia |
country |
Indonesia |
content_provider |
Institut Teknologi Bandung |
collection |
Digital ITB |
language |
Indonesia |
description |
In recent years, many researchers have focused on improving dialogue systems in
both task-oriented dialogue (ToD) and chitchat domains. However, to create a
dialogue system that truly mimics human conversation, we need to combine these
objectives, as natural human interactions usually involve both engaging and
informative elements. So far, there has been no significant work addressing this
challenge using large language models, particularly open-source ones.
This thesis offers a comprehensive exploration of the fused dialogue system task
using an open-source large language model, along with an analysis of the model's
performance and error patterns. Additionally, this thesis proposes a robust
framework to tackle this task using just a belief span 'bpsn'. Our approach opens
new possibilities and provides several insights for future research in this domain.
The experiments in this thesis cover three areas: (1) the proposed mdc-fuse
architecture with Mistral-7B instruct and GODEL-base; (2) naive approaches to
several subtasks (dialogue state tracking, response generation, and end-to-end
dialogue) on two open-source LLMs, Mistral-7B instruct and Llama-3, to compare
them; and (3) the compatibility of InstructTODS with smaller open-source LLMs
such as Mistral-7B instruct. These are followed by several additional analyses
of the fused dialogue system.
The experimental results indicate that the choice of language model and training
method in the mdc-fuse architecture significantly affects its performance.
The type of open-source LLM used also plays a critical role:
architectures using Llama-3 tend to achieve higher metric scores
than those using Mistral-7B instruct. Among the cases studied,
mdc-fuse with GODEL-base performed better than mdc-fuse with
Mistral-7B instruct. Additionally, the experiments reveal that increasing the
number of shots and limiting the dialogue context can further improve the
architecture's performance. |
format |
Theses |
author |
Dehan Al Kautsar, Muhammad |
title |
FUSED DIALOGUE SYSTEM ON OPEN-SOURCE LARGE LANGUAGE MODEL USING END-TO-END APPROACH |
url |
https://digilib.itb.ac.id/gdl/view/84274 |
_version_ |
1822998498652454912 |