Using self-organizing maps and regression to solve the acoustic-to-articulatory inversion as input to a visual articulatory feedback system

Visual articulatory feedback (VAF) systems provide visual representations of the user's articulations as feedback and have been shown to help in second language learning and in speech therapy for people with hearing impairment. However, one of their current limitations is that they do not give feedback on how to correct articulations. This can be overcome by showing which articulators are being used vis-à-vis the correct articulators through acoustic-to-articulatory inversion. Research on this problem has continued for the last forty years because of its one-to-many nature and its non-linearity; however, most existing approaches are not easily applicable to VAFs and focus on sounds made by non-hearing-impaired people. Using a combination of Self-Organizing Maps (SOMs) and regression models, this research aims to solve the acoustic-to-articulatory inversion problem as input to the creation of a VAF for the hearing impaired. The models are created using acoustic and articulatory data from the MOCHA-TIMIT database. Acoustic data are represented using Mel Frequency Cepstral Coefficients (MFCCs), while articulatory data are represented using the Cartesian coordinates of the different articulators. Video and audio data from people with hearing impairment are also collected for testing. Using the models created, the values of the articulatory parameters are derived from the audio data; in addition, the specific articulators that should be adjusted to produce the target sound are identified. Visualization was done by plotting the Cartesian coordinates of the articulatory features and overlaying a side view of the vocal tract on them. Results showed that the inversion methodology described here does not offer a significant improvement over existing methods. People with hearing impairment have different strengths and weaknesses with regard to the sounds they pronounce (the first subject is good at 'o' vowels while the second is good at 'e' vowels), and there are only slight differences between the predicted articulatory positions of the hearing impaired and the target. However, the sounds produced by the hearing impaired have a nasal quality, which is not captured by the model due to lack of data.
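The method sketched in the abstract has two trainable stages: a Self-Organizing Map that partitions the acoustic (MFCC) space, and a regression model attached to each map node that maps acoustic frames to articulator coordinates, followed by a plot of the predicted coordinates over a vocal-tract side view. The thesis itself does not publish code, so the following is a minimal NumPy/matplotlib sketch of that general SOM-plus-local-regression idea; the grid size, training schedule, fallback strategy, and every function name here are illustrative assumptions, not the author's implementation.

    import numpy as np

    def train_som(acoustic, grid=(8, 8), iters=5000, lr0=0.5, sigma0=3.0, seed=0):
        # Fit a small SOM over MFCC frames (shape: n_frames x n_mfcc).
        # Grid size, learning schedule, and iteration count are assumptions.
        rng = np.random.default_rng(seed)
        h, w = grid
        weights = rng.normal(size=(h, w, acoustic.shape[1]))
        ii, jj = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
        for t in range(iters):
            x = acoustic[rng.integers(len(acoustic))]
            d = np.linalg.norm(weights - x, axis=2)          # distance to every node
            bi, bj = np.unravel_index(np.argmin(d), (h, w))  # best-matching unit
            lr = lr0 * np.exp(-t / iters)                    # decaying learning rate
            sigma = sigma0 * np.exp(-t / iters)              # shrinking neighborhood
            g = np.exp(-((ii - bi) ** 2 + (jj - bj) ** 2) / (2 * sigma ** 2))
            weights += lr * g[..., None] * (x - weights)     # pull neighbors toward x
        return weights

    def bmu(weights, x):
        # Grid index of the node whose weight vector is closest to frame x.
        d = np.linalg.norm(weights - x, axis=2)
        return np.unravel_index(np.argmin(d), d.shape)

    def fit_local_regressions(weights, acoustic, articulatory):
        # One least-squares linear map per SOM node, trained on the frames that
        # node wins; a global map serves as fallback for nodes with no frames.
        X = np.hstack([acoustic, np.ones((len(acoustic), 1))])  # bias column
        global_B, *_ = np.linalg.lstsq(X, articulatory, rcond=None)
        assign = [bmu(weights, x) for x in acoustic]
        nodes = {}
        for node in set(assign):
            idx = [k for k, a in enumerate(assign) if a == node]
            B, *_ = np.linalg.lstsq(X[idx], articulatory[idx], rcond=None)
            nodes[node] = B
        return nodes, global_B

    def invert(weights, nodes, global_B, x):
        # Predict articulator coordinates for one MFCC frame.
        B = nodes.get(bmu(weights, x), global_B)
        return np.append(x, 1.0) @ B

    def plot_articulators(coords, outline=None):
        # Overlay predicted articulator positions on a side-view vocal-tract
        # outline, as the abstract describes; `outline` is a hypothetical
        # (n_points x 2) trace of the tract, or None to skip the overlay.
        import matplotlib.pyplot as plt
        pts = coords.reshape(-1, 2)                  # assume (x, y) pairs
        if outline is not None:
            plt.plot(outline[:, 0], outline[:, 1], "k-", linewidth=1)
        plt.scatter(pts[:, 0], pts[:, 1], c="red")
        plt.gca().set_aspect("equal")
        plt.show()

In the pipeline the abstract describes, acoustic would hold per-frame MFCC vectors computed from MOCHA-TIMIT audio and articulatory the matching electromagnetic-articulography coordinates; at test time, invert is applied to MFCC frames from a hearing-impaired speaker's recording and the predicted coordinates are drawn over the vocal-tract outline.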


Bibliographic Details
Main Author: Agustin, Natalie S.
Format: text
Language: English
Published: Animo Repository 2014
Online Access: https://animorepository.dlsu.edu.ph/etd_masteral/4636
Institution: De La Salle University
Language: English
id oai:animorepository.dlsu.edu.ph:etd_masteral-11474
record_format eprints
spelling oai:animorepository.dlsu.edu.ph:etd_masteral-11474 2022-06-22T03:04:00Z Using self-organizing maps and regression to solve the acoustic-to-articulatory inversion as input to a visual articulatory feedback system Agustin, Natalie S. Visual articulatory feedback (VAF) systems provide visual representations of the user's articulations as feedback and have been shown to help in second language learning and in speech therapy for people with hearing impairment. However, one of their current limitations is that they do not give feedback on how to correct articulations. This can be overcome by showing which articulators are being used vis-à-vis the correct articulators through acoustic-to-articulatory inversion. Research on this problem has continued for the last forty years because of its one-to-many nature and its non-linearity; however, most existing approaches are not easily applicable to VAFs and focus on sounds made by non-hearing-impaired people. Using a combination of Self-Organizing Maps (SOMs) and regression models, this research aims to solve the acoustic-to-articulatory inversion problem as input to the creation of a VAF for the hearing impaired. The models are created using acoustic and articulatory data from the MOCHA-TIMIT database. Acoustic data are represented using Mel Frequency Cepstral Coefficients (MFCCs), while articulatory data are represented using the Cartesian coordinates of the different articulators. Video and audio data from people with hearing impairment are also collected for testing. Using the models created, the values of the articulatory parameters are derived from the audio data; in addition, the specific articulators that should be adjusted to produce the target sound are identified. Visualization was done by plotting the Cartesian coordinates of the articulatory features and overlaying a side view of the vocal tract on them. Results showed that the inversion methodology described here does not offer a significant improvement over existing methods. People with hearing impairment have different strengths and weaknesses with regard to the sounds they pronounce (the first subject is good at 'o' vowels while the second is good at 'e' vowels), and there are only slight differences between the predicted articulatory positions of the hearing impaired and the target. However, the sounds produced by the hearing impaired have a nasal quality, which is not captured by the model due to lack of data. 2014-01-01T08:00:00Z text https://animorepository.dlsu.edu.ph/etd_masteral/4636 Master's Theses English Animo Repository
institution De La Salle University
building De La Salle University Library
continent Asia
country Philippines
content_provider De La Salle University Library
collection DLSU Institutional Repository
language English
description Visual articulatory feedback (VAF) systems provide visual representations of the user's articulations as feedback and have been shown to help in second language learning and in speech therapy for people with hearing impairment. However, one of their current limitations is that they do not give feedback on how to correct articulations. This can be overcome by showing which articulators are being used vis-à-vis the correct articulators through acoustic-to-articulatory inversion. Research on this problem has continued for the last forty years because of its one-to-many nature and its non-linearity; however, most existing approaches are not easily applicable to VAFs and focus on sounds made by non-hearing-impaired people. Using a combination of Self-Organizing Maps (SOMs) and regression models, this research aims to solve the acoustic-to-articulatory inversion problem as input to the creation of a VAF for the hearing impaired. The models are created using acoustic and articulatory data from the MOCHA-TIMIT database. Acoustic data are represented using Mel Frequency Cepstral Coefficients (MFCCs), while articulatory data are represented using the Cartesian coordinates of the different articulators. Video and audio data from people with hearing impairment are also collected for testing. Using the models created, the values of the articulatory parameters are derived from the audio data; in addition, the specific articulators that should be adjusted to produce the target sound are identified. Visualization was done by plotting the Cartesian coordinates of the articulatory features and overlaying a side view of the vocal tract on them. Results showed that the inversion methodology described here does not offer a significant improvement over existing methods. People with hearing impairment have different strengths and weaknesses with regard to the sounds they pronounce (the first subject is good at 'o' vowels while the second is good at 'e' vowels), and there are only slight differences between the predicted articulatory positions of the hearing impaired and the target. However, the sounds produced by the hearing impaired have a nasal quality, which is not captured by the model due to lack of data.
format text
author Agustin, Natalie S.
spellingShingle Agustin, Natalie S.
Using self-organizing maps and regression to solve the acoustic-to-articulatory inversion as input to a visual articulatory feedback system
author_facet Agustin, Natalie S.
author_sort Agustin, Natalie S.
title Using self-organizing maps and regression to solve the acoustic-to-articulatory inversion as input to a visual articulatory feedback system
title_short Using self-organizing maps and regression to solve the acoustic-to-articulatory inversion as input to a visual articulatory feedback system
title_full Using self-organizing maps and regression to solve the acoustic-to-articulatory inversion as input to a visual articulatory feedback system
title_fullStr Using self-organizing maps and regression to solve the acoustic-to-articulatory inversion as input to a visual articulatory feedback system
title_full_unstemmed Using self-organizing maps and regression to solve the acoustic-to-articulatory inversion as input to a visual articulatory feedback system
title_sort using self-organizing maps and regression to solve the acoustic-to-articulatory inversion as input to a visual articulatory feedback system
publisher Animo Repository
publishDate 2014
url https://animorepository.dlsu.edu.ph/etd_masteral/4636
_version_ 1736864192767459328