Developing a graphene Q&A chatbot using retrieval augmented generation (RAG)

Graphene synthesis is a rapidly growing market with various methods for different applications. However, the mass production of high-quality graphene that is cost-effective and environmentally sustainable has not been established commercially. Current graphene synthesis techniques also face issues r...

Bibliographic Details
Main Author:	Sara Johari
Other Authors:	Leonard Ng Wei Tat
Format:	Final Year Project
Language:	English
Published:	Nanyang Technological University 2024
Subjects:	Engineering
Online Access:	https://hdl.handle.net/10356/175994
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-175994
record_format	dspace
spelling	sg-ntu-dr.10356-1759942024-05-18T16:45:49Z Developing a graphene Q&A chatbot using retrieval augmented generation (RAG) Sara Johari Leonard Ng Wei Tat School of Materials Science and Engineering leonard.ngwt@ntu.edu.sg Engineering Graphene synthesis is a rapidly growing market with various methods for different applications. However, the mass production of high-quality graphene that is cost-effective and environmentally sustainable has not been established commercially. Current graphene synthesis techniques also face issues related to reproducibility. Recently, the proliferation of artificial intelligence (AI) with ever-evolving large language models (LLMs), along with the emergence of the Retrieval Augmented Generation (RAG) approach, has demonstrated significant abilities to produce natural responses with a vast amount of knowledge. Therefore, there is an interest in combining the database of graphene synthesis with AI to remarkably assist in the research process. This experimental study tested the use of UMAP visualizations to determine the optimal chunk size and overlap. Subsequently, two LLMs, the DRAGON Deci-7B LLM and the DRAGON Mistral-7B LLM, were tested within a RAG question-answering chatbot architecture. The chatbots were then further evaluated with two advanced retrieval methods: the parent document retriever and the ensemble retriever. The chatbots were evaluated by RAGAs, a performance metric framework, with ChatGPT as a benchmark using a synthetic dataset of 10 questions and corresponding ground truths. Human evaluation was also conducted by manually inputting a user prompt into the chatbots and analysing the response generated. In summary, through LLM evaluations with ChatGPT, the optimal chatbot developed in this study utilized the DRAGON Mistral-7B LLM with the parent document retrieval method, with an embedded chunk size of 256 tokens and a 10% overlap. However, human evaluations raised concerns with regard to the actual usability of the chatbot. Further troubleshooting and refinement would be necessary, but this was constrained by the costs associated with the project. Bachelor's degree 2024-05-12T23:55:09Z 2024-05-12T23:55:09Z 2024 Final Year Project (FYP) Sara Johari (2024). Developing a graphene Q&A chatbot using retrieval augmented generation (RAG). Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/175994 https://hdl.handle.net/10356/175994 en application/pdf Nanyang Technological University
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	Engineering
spellingShingle	Engineering Sara Johari Developing a graphene Q&A chatbot using retrieval augmented generation (RAG)
description	Graphene synthesis is a rapidly growing market with various methods for different applications. However, the mass production of high-quality graphene that is cost-effective and environmentally sustainable has not been established commercially. Current graphene synthesis techniques also face issues related to reproducibility. Recently, the proliferation of artificial intelligence (AI) with ever-evolving large language models (LLMs), along with the emergence of the Retrieval Augmented Generation (RAG) approach, has demonstrated significant abilities to produce natural responses with a vast amount of knowledge. Therefore, there is an interest in combining the database of graphene synthesis with AI to remarkably assist in the research process. This experimental study tested the use of UMAP visualizations to determine the optimal chunk size and overlap. Subsequently, two LLMs, the DRAGON Deci-7B LLM and the DRAGON Mistral-7B LLM, were tested within a RAG question-answering chatbot architecture. The chatbots were then further evaluated with two advanced retrieval methods: the parent document retriever and the ensemble retriever. The chatbots were evaluated by RAGAs, a performance metric framework, with ChatGPT as a benchmark using a synthetic dataset of 10 questions and corresponding ground truths. Human evaluation was also conducted by manually inputting a user prompt into the chatbots and analysing the response generated. In summary, through LLM evaluations with ChatGPT, the optimal chatbot developed in this study utilized the DRAGON Mistral-7B LLM with the parent document retrieval method, with an embedded chunk size of 256 tokens and a 10% overlap. However, human evaluations raised concerns with regard to the actual usability of the chatbot. Further troubleshooting and refinement would be necessary, but this was constrained by the costs associated with the project.
author2	Leonard Ng Wei Tat
author_facet	Leonard Ng Wei Tat Sara Johari
format	Final Year Project
author	Sara Johari
author_sort	Sara Johari
title	Developing a graphene Q&A chatbot using retrieval augmented generation (RAG)
title_short	Developing a graphene Q&A chatbot using retrieval augmented generation (RAG)
title_full	Developing a graphene Q&A chatbot using retrieval augmented generation (RAG)
title_fullStr	Developing a graphene Q&A chatbot using retrieval augmented generation (RAG)
title_full_unstemmed	Developing a graphene Q&A chatbot using retrieval augmented generation (RAG)
title_sort	developing a graphene q&a chatbot using retrieval augmented generation (rag)
publisher	Nanyang Technological University
publishDate	2024
url	https://hdl.handle.net/10356/175994
_version_	1814047247083503616
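
The description field in this record reports an embedded chunk size of 256 tokens with a 10% overlap. As a rough illustration only, and not the author's implementation, the sketch below shows what such a split could look like. The function name `chunk_tokens`, the rounding of the overlap to a whole number of tokens, and the use of a plain Python list as the token sequence are all assumptions made for this example; the record does not state which tokenizer or splitter the project used.

```python
def chunk_tokens(tokens, chunk_size=256, overlap_ratio=0.10):
    """Split a token sequence into overlapping chunks.

    Hypothetical helper: chunk_size and overlap_ratio default to the
    values named in the abstract (256 tokens, 10% overlap).
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap_ratio < 1:
        raise ValueError("overlap_ratio must be in [0, 1)")

    # Assumption: the overlap is rounded to the nearest whole token.
    overlap = int(round(chunk_size * overlap_ratio))
    # Each new chunk starts this many tokens after the previous one.
    step = max(1, chunk_size - overlap)

    chunks = []
    start = 0
    while start < len(tokens):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once the current chunk reaches the end of the sequence.
        if start + chunk_size >= len(tokens):
            break
        start += step
    return chunks


if __name__ == "__main__":
    # Stand-in token ids; a real pipeline would use the embedding
    # model's own tokenizer instead.
    demo_tokens = list(range(600))
    for i, chunk in enumerate(chunk_tokens(demo_tokens)):
        print(i, chunk[0], chunk[-1], len(chunk))
```

With these defaults, consecutive chunks share 26 tokens, so a sentence that falls on a chunk boundary still appears whole in at least one chunk as long as it is shorter than the overlap.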
