Punctuation restoration for speech transcripts using large language models
This thesis explores punctuation restoration in speech transcripts using Large Language Models (LLMs) to enhance text readability and comprehension. We focus on the efficacy of two LLMs, XLM-RoBERTa and Llama-2. The primary contributions are the refinement of an existing XLM-RoBERTa model and the fine-tuning of the 13-billion-parameter Llama-2 using several advanced techniques. For XLM-RoBERTa, we implement an evaluation pipeline and apply a model-checkpoint ensemble technique that improves its F1-score by 3%. The fine-tuned Llama-2 model incorporates prompt engineering and Low-Rank Adaptation (LoRA), achieving an F1-score of 0.73 — performance comparable to, and on some punctuation classes superior to, Google's state-of-the-art Gemini model. The project also details the fine-tuning processes, data-processing strategies, and standardized evaluation methodologies for the different LLMs, and our experimental analysis provides a thorough evaluation of model performance. Based on the refined model architecture and the research conducted in this project, two papers have been accepted at the ACIIDS and ICAICTA conferences. Future work will extend the LLM evaluations to additional datasets and further refine and fine-tune the models to address challenges that emerged during our experiments.
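The model-checkpoint ensemble mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the thesis's actual implementation: it assumes each checkpoint emits per-token probabilities over a hypothetical punctuation label set, and it averages those probabilities before taking the argmax. All class names and probability values below are invented for illustration.

```python
# Minimal sketch of checkpoint ensembling for token-level punctuation
# restoration: average each checkpoint's per-token class probabilities,
# then take the argmax per token. Labels and values are hypothetical.

CLASSES = ["O", "COMMA", "PERIOD", "QUESTION"]  # "O" = no punctuation


def ensemble_predict(checkpoint_probs):
    """checkpoint_probs: list over checkpoints; each entry is a per-token
    list of probability vectors (one float per class, summing to 1)."""
    n_ckpt = len(checkpoint_probs)
    n_tok = len(checkpoint_probs[0])
    labels = []
    for t in range(n_tok):
        # Average the class probabilities across checkpoints for token t.
        avg = [sum(cp[t][c] for cp in checkpoint_probs) / n_ckpt
               for c in range(len(CLASSES))]
        labels.append(CLASSES[max(range(len(CLASSES)), key=avg.__getitem__)])
    return labels


# Two checkpoints disagree on the second token; averaging resolves it.
ckpt_a = [[0.9, 0.05, 0.03, 0.02], [0.2, 0.1, 0.6, 0.1]]
ckpt_b = [[0.8, 0.1, 0.05, 0.05], [0.6, 0.1, 0.25, 0.05]]
print(ensemble_predict([ckpt_a, ckpt_b]))  # → ['O', 'PERIOD']
```

Averaging probabilities (rather than hard-voting on labels) lets a confident checkpoint outweigh an uncertain one, which is one common way such an ensemble can lift the F1-score over any single checkpoint.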
Saved in:

| Main Author | Liu, Changsong |
|---|---|
| Other Authors | Chng Eng Siong |
| Format | Final Year Project |
| Language | English |
| Published | Nanyang Technological University, 2024 |
| Subjects | Computer and Information Science; Punctuation restoration; Large language models; Natural language processing |
| Online Access | https://hdl.handle.net/10356/175306 |
| Institution | Nanyang Technological University |
Record details:

| Record ID | sg-ntu-dr.10356-175306 |
|---|---|
| Citation | Liu, C. (2024). Punctuation restoration for speech transcripts using large language models. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/175306 |
| Supervisor | Chng Eng Siong (ASESChng@ntu.edu.sg) |
| School | School of Computer Science and Engineering; Temasek Laboratories @ NTU |
| Degree | Bachelor's degree |
| Project code | SCSE23-0753 |
| Collection | DR-NTU (NTU Library) |
| Deposited | 2024-04-22 |