Impaired speech recognition

Dysarthria is a speech disorder that often produces speech that is difficult to understand or easily misinterpreted, creating communication challenges for affected individuals. Recent advances in automatic speech recognition technologies have the potential to assist dysarthric spe...

Bibliographic Details
Main Author: Lam, Michelle Su-Ann
Other Authors: Goh Wooi Boon
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2024
Subjects: Computer and Information Science
Online Access: https://hdl.handle.net/10356/181137
Institution: Nanyang Technological University
id sg-ntu-dr.10356-181137
record_format dspace
spelling sg-ntu-dr.10356-181137
2024-11-15T12:41:28Z
Impaired speech recognition
Lam, Michelle Su-Ann
Goh Wooi Boon
College of Computing and Data Science
ASWBGOH@ntu.edu.sg
Computer and Information Science
Bachelor's degree
2024
Final Year Project (FYP)
Lam, M. S. (2024). Impaired speech recognition. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/181137
en
application/pdf
Nanyang Technological University
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Computer and Information Science
description Dysarthria is a speech disorder that often produces speech that is difficult to understand or easily misinterpreted, creating communication challenges for affected individuals. Recent advances in automatic speech recognition (ASR) technologies have the potential to assist dysarthric speakers with their communication needs. This paper presents the development of an ASR system that can effectively assist such individuals in their daily communication with others. The ability of two state-of-the-art models to accurately transcribe dysarthric speech is explored. A large language model is incorporated as an ASR correction stage to further improve the accuracy of the resulting transcriptions. The effectiveness of the resulting ASR system as a communication assistance tool is demonstrated together with a text-to-speech synthesizer, with both components integrated into a mobile application that aims to recreate clearly spoken words from the original dysarthric speech. Although the resulting ASR system had difficulty generalizing across different dysarthric datasets, incorporating a large language model into the process yielded positive outcomes, as demonstrated by qualitative user testing.
author2 Goh Wooi Boon
author_facet Goh Wooi Boon
Lam, Michelle Su-Ann
format Final Year Project
author Lam, Michelle Su-Ann
author_sort Lam, Michelle Su-Ann
title Impaired speech recognition
publisher Nanyang Technological University
publishDate 2024
url https://hdl.handle.net/10356/181137