Severity-based adaptation with limited data for ASR to aid dysarthric speakers

Automatic speech recognition (ASR) is currently used in many assistive technologies, such as helping individuals with speech impairment in their communication ability. One challenge in ASR for speech-impaired individuals is the difficulty in obtaining a good speech database of impaired speakers for...

Bibliographic Details
Main Authors:	Mustafa, Mumtaz Begum; Salim, Siti Salwah; Mohamed, Noraini; Al-Qatab, Bassam; Siong, Chng Eng
Other Authors:	Snyder, Joel
Format:	Article
Language:	English
Published:	2014
Subjects:	DRNTU::Engineering::Computer science and engineering
Online Access:	https://hdl.handle.net/10356/97554 http://hdl.handle.net/10220/19606
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-97554
record_format	dspace
spelling	sg-ntu-dr.10356-97554; 2022-02-16T16:29:58Z; School of Computer Engineering; Published version; 2014-06-10T03:15:15Z; 2019-12-06T19:44:00Z; 2014; Journal Article; Mustafa, M. B., Salim, S. S., Mohamed, N., Al-Qatab, B., & Siong, C. E. (2014). Severity-Based Adaptation with Limited Data for ASR to Aid Dysarthric Speakers. PLoS ONE, 9(1), e86285; 1932-6203; https://hdl.handle.net/10356/97554; http://hdl.handle.net/10220/19606; 10.1371/journal.pone.0086285; 24466004; en; PLoS ONE; application/pdf; © 2014 Mustafa et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	DRNTU::Engineering::Computer science and engineering
description	Automatic speech recognition (ASR) is currently used in many assistive technologies, for example to help individuals with speech impairment communicate. One challenge in ASR for speech-impaired individuals is the difficulty of obtaining a good speech database of impaired speakers for building an effective speech acoustic model. Because the few existing databases of impaired speech are also limited in size, the obvious way to build an acoustic model of impaired speech is to employ adaptation techniques. However, two issues have not been addressed in existing studies of adaptation for speech impairment: (1) identifying the most effective adaptation technique for impaired speech; and (2) the use of suitable source models to build an effective impaired-speech acoustic model. This research investigates these two issues for dysarthria, a type of speech impairment affecting millions of people. We applied both unimpaired and impaired speech as the source model with well-known adaptation techniques such as maximum likelihood linear regression (MLLR) and constrained MLLR (C-MLLR). The recognition accuracy of each impaired-speech acoustic model is measured in terms of word error rate (WER), with further assessments including phoneme insertion, substitution and deletion rates. Unimpaired speech, when combined with limited high-quality speech-impaired data, improves the performance of ASR systems in recognising severely impaired dysarthric speech. The C-MLLR adaptation technique was also found to be better than MLLR in recognising mildly and moderately impaired speech, based on statistical analysis of the WER. Phoneme substitution was found to be the biggest contributor to WER in dysarthric speech at all levels of severity. The results show that speech acoustic models derived from suitable adaptation techniques improve the performance of ASR systems in recognising impaired speech with limited adaptation data.
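The WER and the insertion, substitution and deletion counts named in the description can be illustrated with a minimal sketch. It assumes the standard edit-distance definition, WER = (S + D + I) / N, where N is the number of reference tokens; the record does not say which scoring tool the authors used, and the names `ErrorCounts` and `count_errors` are hypothetical. The same function works on phoneme sequences if phoneme labels are passed instead of words.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ErrorCounts:
    substitutions: int
    deletions: int
    insertions: int
    ref_len: int

    @property
    def wer(self) -> float:
        # WER = (S + D + I) / N, with N the reference length.
        total = self.substitutions + self.deletions + self.insertions
        return total / self.ref_len


def count_errors(ref: Sequence[str], hyp: Sequence[str]) -> ErrorCounts:
    """Align hyp against ref by minimum edit distance and count S, D, I.

    Hypothetical helper for illustration only. When several alignments
    share the minimum cost, a match/substitution is preferred over a
    deletion, and a deletion over an insertion.
    """
    if not ref:
        raise ValueError("reference must contain at least one token")
    n, m = len(ref), len(hyp)
    # dp[i][j] = (cost, S, D, I) for aligning ref[:i] with hyp[:j].
    dp = [[(0, 0, 0, 0)] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = (i, 0, i, 0)  # every reference token deleted
    for j in range(1, m + 1):
        dp[0][j] = (j, 0, 0, j)  # every hypothesis token inserted
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost, s, d, ins = dp[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                best = (cost, s, d, ins)  # match, no error
            else:
                best = (cost + 1, s + 1, d, ins)  # substitution
            cost, s, d, ins = dp[i - 1][j]
            if cost + 1 < best[0]:
                best = (cost + 1, s, d + 1, ins)  # deletion
            cost, s, d, ins = dp[i][j - 1]
            if cost + 1 < best[0]:
                best = (cost + 1, s, d, ins + 1)  # insertion
            dp[i][j] = best
    _, s, d, ins = dp[n][m]
    return ErrorCounts(s, d, ins, n)


# Made-up example sentence, not data from the article.
example = count_errors("the cat sat on the mat".split(),
                       "the cat sit on mat".split())
print(example, round(example.wer, 3))
```

In the made-up example, "sat" is recognised as "sit" (one substitution) and the second "the" is dropped (one deletion), so two errors are counted against six reference words.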
author2	Snyder, Joel
author_facet	Snyder, Joel; Mustafa, Mumtaz Begum; Salim, Siti Salwah; Mohamed, Noraini; Al-Qatab, Bassam; Siong, Chng Eng
format	Article
author	Mustafa, Mumtaz Begum; Salim, Siti Salwah; Mohamed, Noraini; Al-Qatab, Bassam; Siong, Chng Eng
author_sort	Mustafa, Mumtaz Begum
title	Severity-based adaptation with limited data for ASR to aid dysarthric speakers
publishDate	2014
url	https://hdl.handle.net/10356/97554 http://hdl.handle.net/10220/19606

Similar Items