Heuristic development in the use of large language models for materials science

Bibliographic Details
Main Author: Cho, Zen Han
Other Authors: Leonard Ng Wei Tat
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2024
Subjects:
Online Access:https://hdl.handle.net/10356/176095
Institution: Nanyang Technological University
Description
Summary: This project applies Artificial Intelligence, specifically large language models (LLMs), to the field of materials science and engineering. The overarching objective is to enhance the toolkit available to materials researchers and thereby improve research efficiency. To this end, the project develops a specialized chatbot that answers questions about the fabrication of graphene, a material chosen for its significant interest and importance within the discipline. Grounded in the principles of retrieval-augmented generation (RAG) and prompt engineering, the chatbot is designed to give reliable and accurate responses. The project also conducts a comparative analysis of multiple LLMs to identify the best-performing model: ten questions about graphene were crafted, and a scoring matrix was devised to assess the performance of each LLM. To minimize external variables, consistent model parameters were maintained across experiments, so that the scores obtained were primarily indicative of LLM performance. The software platforms used were Amazon Web Services and Google Colab. Model outputs were compared with and without the integration of RAG and prompt engineering; outputs generated with these techniques were significantly more accurate and informative. Finally, an analytical evaluation of four LLMs on graphene-related queries revealed that Google's Gemini-pro model outperformed the others.
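The retrieval-augmented generation approach described in the summary can be sketched in a few lines: retrieve the reference passage most relevant to a question, then prepend it to the prompt sent to an LLM. The toy corpus, overlap-based scoring, and prompt template below are illustrative assumptions, not the project's actual code, which used Amazon Web Services and Google Colab with real LLM backends.

```python
import re

# Hypothetical mini-corpus standing in for the project's graphene references.
CORPUS = [
    "Graphene can be fabricated by chemical vapour deposition on copper foil.",
    "Gemini-pro is a large language model released by Google.",
]

def tokenize(text):
    """Lowercase and split text into a set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, corpus):
    """Return the passage sharing the most word tokens with the question."""
    q = tokenize(question)
    return max(corpus, key=lambda passage: len(q & tokenize(passage)))

def build_prompt(question, context):
    """Combine the retrieved context and the question into an augmented prompt."""
    return (
        "Answer using only the context below.\n"
        f"Context: {context}\n"
        f"Question: {question}"
    )

question = "How is graphene fabricated?"
context = retrieve(question, CORPUS)
prompt = build_prompt(question, context)
print(prompt)
```

In a real pipeline the word-overlap scorer would be replaced by embedding similarity over a vector store, and `prompt` would be sent to the chosen LLM; the grounding step is what lets RAG-based answers score higher on accuracy and informativeness than unaugmented outputs.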