Punctuation restoration for speech transcripts using large language models


Bibliographic Details
Main Author: Liu, Changsong
Other Authors: Chng Eng Siong
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2024
Subjects: Computer and Information Science; Punctuation restoration; Large language models; Natural language processing
Online Access:https://hdl.handle.net/10356/175306
Institution: Nanyang Technological University
Record ID: sg-ntu-dr.10356-175306
School: School of Computer Science and Engineering; Temasek Laboratories @ NTU
Supervisor: Chng Eng Siong (ASESChng@ntu.edu.sg)
Degree: Bachelor's degree
Date deposited: 2024-04-22
Project code: SCSE23-0753
File format: application/pdf
Collection: DR-NTU (NTU Library)
Citation: Liu, C. (2024). Punctuation restoration for speech transcripts using large language models. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/175306
Description: This thesis explores punctuation restoration in speech transcripts using large language models (LLMs) to enhance the readability and comprehension of text. We focus on two models: XLM-RoBERTa and Llama-2. The primary contributions are the refinement of an existing XLM-RoBERTa model and the fine-tuning of the 13-billion-parameter Llama-2 using several advanced techniques. For XLM-RoBERTa, we implement an evaluation pipeline and apply a model-checkpoint ensemble technique that improves its F1-score by 3%. The fine-tuned Llama-2 model combines prompt engineering with Low-Rank Adaptation (LoRA) and achieves an F1-score of 0.73, performance comparable to, and in places superior to, Google's state-of-the-art Gemini model across all punctuation classes. The project also documents the fine-tuning processes, data-processing strategies, and standardized evaluation methodologies for the different LLMs, and our experimental analysis provides a thorough evaluation of model performance. Two papers based on this work have been accepted at the ACIIDS and ICAICTA conferences. Future work will extend the evaluation to additional datasets and further refine the models to address challenges that emerged during our experiments.
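The checkpoint-ensemble technique mentioned above amounts to averaging the parameters of several saved checkpoints rather than keeping only the best single one. A minimal sketch of that idea, using plain dictionaries of floats in place of real model state dicts (the function name and toy values are ours, not the thesis's):

```python
def average_checkpoints(checkpoints):
    """Average a list of state dicts (parameter name -> weight) element-wise.

    Assumes every checkpoint shares the same parameter names; weights are
    plain floats here for illustration (real state dicts hold tensors).
    """
    if not checkpoints:
        raise ValueError("need at least one checkpoint")
    n = len(checkpoints)
    return {k: sum(ckpt[k] for ckpt in checkpoints) / n
            for k in checkpoints[0]}

# Three toy "checkpoints" saved at different training steps:
ckpts = [
    {"w": 0.9, "b": 0.1},
    {"w": 1.1, "b": 0.3},
    {"w": 1.0, "b": 0.2},
]
ensembled = average_checkpoints(ckpts)
```

In practice the same averaging is applied to each tensor in the checkpoints' state dicts before a final evaluation pass.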
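LoRA, also named in the description, freezes the pretrained weights and trains only a pair of low-rank matrices whose scaled product is added to each adapted weight matrix. A dependency-free sketch of that update with tiny hand-written matrices (real implementations operate on tensors inside attention layers; these helper names are illustrative):

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def apply_lora(W, A, B, alpha, r):
    """Return W + (alpha / r) * B @ A, the LoRA-adapted weight matrix.

    W is d x k (frozen), B is d x r, A is r x k; with r << min(d, k),
    B @ A is a low-rank update, and alpha / r is LoRA's scaling factor.
    """
    scale = alpha / r
    delta = matmul(B, A)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]

# Toy example: d = 2, k = 2, rank r = 1, alpha = 2.
W = [[1.0, 0.0], [0.0, 1.0]]   # frozen base weights
B = [[1.0], [2.0]]             # d x r, trained
A = [[0.5, -0.5]]              # r x k, trained
adapted = apply_lora(W, A, B, alpha=2.0, r=1)
```

Because only A and B are trained, the number of trainable parameters drops from d*k to r*(d + k) per adapted matrix, which is what makes fine-tuning a 13B-parameter model tractable.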
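The per-class F1 evaluation behind the reported 0.73 score treats punctuation restoration as token-level classification: each word receives a label for the punctuation that follows it. A minimal sketch under that framing (the label set and function name are our assumptions, not the thesis's exact scheme):

```python
from collections import Counter

PUNCT_CLASSES = ["COMMA", "PERIOD", "QUESTION"]  # assumed label set; "O" = no punctuation

def per_class_f1(gold, pred):
    """Per-class F1 for aligned token-level punctuation labels."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if p == g and p in PUNCT_CLASSES:
            tp[p] += 1
        elif p != g:
            if p in PUNCT_CLASSES:
                fp[p] += 1   # predicted punctuation that isn't there
            if g in PUNCT_CLASSES:
                fn[g] += 1   # missed punctuation
    scores = {}
    for c in PUNCT_CLASSES:
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        scores[c] = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return scores

gold = ["O", "COMMA", "O", "PERIOD", "O", "QUESTION"]
pred = ["O", "COMMA", "COMMA", "PERIOD", "O", "PERIOD"]
scores = per_class_f1(gold, pred)
```

Reporting F1 per class, rather than overall accuracy, matters because "O" dominates the label distribution and rare classes such as question marks would otherwise be invisible in the metric.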