Using self-organizing maps and regression to solve the acoustic-to-articulatory inversion as input to a visual articulatory feedback system
Main Author: Agustin, Natalie S.
Format: text
Language: English
Published: Animo Repository, 2014
Online Access: https://animorepository.dlsu.edu.ph/etd_masteral/4636
Institution: De La Salle University
id: oai:animorepository.dlsu.edu.ph:etd_masteral-11474
record_format: eprints
institution: De La Salle University
building: De La Salle University Library
continent: Asia
country: Philippines
content_provider: De La Salle University Library
collection: DLSU Institutional Repository
language: English
description:
Visual articulatory feedback (VAF) systems provide visual representations of the user's articulations as feedback and have been shown to help in second-language learning and in speech therapy for people with hearing impairment. However, one of their current limitations is that they do not give feedback on how to correct articulations. This can be overcome by showing which articulators are being used vis-à-vis the correct articulators through acoustic-to-articulatory inversion. Research on this problem has continued for the last forty years because of its one-to-many nature and its non-linearity. However, most existing approaches are not easily applicable to VAFs and focus on sounds made by people without hearing impairment. Using a combination of Self-Organizing Maps (SOMs) and regression models, this research aims to solve the acoustic-to-articulatory inversion problem as input to a VAF for the hearing impaired. The models are trained on acoustic and articulatory data from the MOCHA-TIMIT database. Acoustic data are represented using Mel Frequency Cepstral Coefficients (MFCCs), while articulatory data are represented as Cartesian coordinates of the different articulators. Video and audio data from people with hearing impairment are also collected for testing. Using the trained models, the values of the articulatory parameters are derived from the audio data. In addition, the specific articulators that should be adjusted to produce the target sound are identified. Visualization is done by plotting the Cartesian coordinates of the articulatory features and overlaying them on a side view of the vocal tract. Results show that the inversion methodology described here does not yield a significant improvement over existing methods. People with hearing impairment have different strengths and weaknesses with regard to the sounds they pronounce (the first subject is good at 'o' vowels, while the second is good at 'e' vowels). Also, there are only slight differences between the predicted articulatory positions of the hearing-impaired speakers and the target positions. However, the sounds produced by the hearing impaired have a nasal quality, which is not captured by the model due to lack of data.
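The abstract describes the pipeline only at a high level: SOMs combined with regression models, MFCC acoustic features, Cartesian articulator coordinates from MOCHA-TIMIT, and a vocal-tract overlay for visualization. The sketch below is one plausible reading of that combination, fitting a local regression per SOM node; the libraries (librosa, MiniSom, scikit-learn, matplotlib), file names, map size, and coordinate layout are illustrative assumptions, not the thesis implementation.

```python
# Illustrative sketch only: SOM clustering of MFCC frames plus per-node linear
# regression to predict articulator coordinates. Library choices, file names,
# and array shapes are assumptions, not the thesis code.
import numpy as np
import librosa
import matplotlib.pyplot as plt
from minisom import MiniSom
from sklearn.linear_model import LinearRegression

def mfcc_frames(wav_path, n_mfcc=13):
    """Represent the acoustic signal as MFCC frames (one row per analysis frame)."""
    y, sr = librosa.load(wav_path, sr=16000)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).T

# Training data (assumed shapes): MFCC frames from MOCHA-TIMIT and time-aligned
# EMA articulator positions stored as (x, y) pairs per articulator.
X_train = np.load("mocha_mfcc.npy")          # (n_frames, 13)  -- hypothetical file
Y_train = np.load("mocha_articulators.npy")  # (n_frames, 14)  -- hypothetical file

# 1) Organize the acoustic space with a self-organizing map.
som = MiniSom(x=10, y=10, input_len=X_train.shape[1], sigma=1.0, learning_rate=0.5)
som.train_random(X_train, num_iteration=10000)

# 2) Fit one regression model per SOM node on the frames that node wins, so each
#    mapping only has to cover a small, more nearly linear region of the space.
global_model = LinearRegression().fit(X_train, Y_train)
node_models = {}
for i in range(10):
    for j in range(10):
        mask = np.array([som.winner(x) == (i, j) for x in X_train])
        if mask.sum() >= 20:  # skip nodes with too few frames
            node_models[(i, j)] = LinearRegression().fit(X_train[mask], Y_train[mask])

def invert(mfcc_frame):
    """Predict articulator coordinates for one MFCC frame (acoustic-to-articulatory inversion)."""
    model = node_models.get(som.winner(mfcc_frame), global_model)
    return model.predict(mfcc_frame[None, :])[0]

# 3) Visualize: plot the predicted coordinates over a mid-sagittal vocal-tract image.
coords = invert(mfcc_frames("subject01.wav")[100]).reshape(-1, 2)  # hypothetical test frame
plt.imshow(plt.imread("vocal_tract_side_view.png"), extent=[-80, 20, -40, 30])  # assumed image/scale
plt.scatter(coords[:, 0], coords[:, 1], c="red")
plt.show()
```

Clustering the acoustic space first is one common way to cope with the one-to-many nature of the inversion: each SOM node covers a narrower acoustic region, so a simple regression per node can approximate the locally smoother acoustic-to-articulatory mapping.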
format: text
author: Agustin, Natalie S.
publisher: Animo Repository
publishDate: 2014
url: https://animorepository.dlsu.edu.ph/etd_masteral/4636