Combined articulatory and auditory processing for improved speech recognition

In this paper, we examined the feasibility of articulatory phonetic inversion (API) conditioned on auditory qualities for improved speech recognition. We also introduced an efficient data-driven heuristic learning algorithm to capture the articulatory-phonetic features (APFs) of English speech. T...

Bibliographic Details
Main Authors: Huang, Guangpu, Er, Meng Joo
Other Authors: School of Electrical and Electronic Engineering
Format: Conference or Workshop Item
Language: English
Published: 2013
Subjects: DRNTU::Engineering::Electrical and electronic engineering
Online Access: https://hdl.handle.net/10356/98873
http://hdl.handle.net/10220/12782
Institution: Nanyang Technological University
id sg-ntu-dr.10356-98873
record_format dspace
spelling sg-ntu-dr.10356-988732020-03-07T13:24:49Z Combined articulatory and auditory processing for improved speech recognition Huang, Guangpu Er, Meng Joo School of Electrical and Electronic Engineering IEEE Conference on Industrial Electronics and Applications (7th : 2012 : Singapore) DRNTU::Engineering::Electrical and electronic engineering In this paper, we examined the feasibility of articulatory phonetic inversion (API) conditioned on the auditory qualities for improved speech recognition. And we introduced an efficient data-driven heuristic learning algorithm to capture the articulatory-phonetic features (APFs) of English speech. Then we reported the performance of the combined auditory and articulatory processing methods in the inversion and recognition experiments. Firstly, at the front end, the auditory based bark-frequency cepstral coefficient (BFCC) obtained equivalent or higher accuracy compared to the mel-frequency cepstral coefficient (MFCC). Secondly, the use of APFs also significantly altered the phoneme error patterns compared to the purely acoustic features, and they displayed advantages over the canonical pseudo-articulatory features (PAFs) which are manually derived from the phonological rules. The observations support our view that the combinational use of auditory and articulatory cues is beneficial for speech pattern classification. And the proposed neural based API model qualifies as a competitive candidate for profound phoneme recognition with salient features such as generality and portability. 2013-08-01T04:25:21Z 2019-12-06T20:00:41Z 2013-08-01T04:25:21Z 2019-12-06T20:00:41Z 2011 2011 Conference Paper https://hdl.handle.net/10356/98873 http://hdl.handle.net/10220/12782 10.1109/ICIEA.2012.6360864 en
institution Nanyang Technological University
building NTU Library
country Singapore
collection DR-NTU
language English
topic DRNTU::Engineering::Electrical and electronic engineering
spellingShingle DRNTU::Engineering::Electrical and electronic engineering
Huang, Guangpu
Er, Meng Joo
Combined articulatory and auditory processing for improved speech recognition
description In this paper, we examined the feasibility of articulatory phonetic inversion (API) conditioned on auditory qualities for improved speech recognition. We also introduced an efficient data-driven heuristic learning algorithm to capture the articulatory-phonetic features (APFs) of English speech. We then reported the performance of the combined auditory and articulatory processing methods in inversion and recognition experiments. First, at the front end, the auditory-based bark-frequency cepstral coefficients (BFCCs) obtained equivalent or higher accuracy than the mel-frequency cepstral coefficients (MFCCs). Second, the use of APFs significantly altered the phoneme error patterns compared to purely acoustic features, and the APFs displayed advantages over the canonical pseudo-articulatory features (PAFs), which are manually derived from phonological rules. These observations support our view that the combined use of auditory and articulatory cues is beneficial for speech pattern classification. The proposed neural-based API model qualifies as a competitive candidate for profound phoneme recognition, with salient features such as generality and portability.
author2 School of Electrical and Electronic Engineering
author_facet School of Electrical and Electronic Engineering
Huang, Guangpu
Er, Meng Joo
format Conference or Workshop Item
author Huang, Guangpu
Er, Meng Joo
author_sort Huang, Guangpu
title Combined articulatory and auditory processing for improved speech recognition
title_short Combined articulatory and auditory processing for improved speech recognition
title_full Combined articulatory and auditory processing for improved speech recognition
title_fullStr Combined articulatory and auditory processing for improved speech recognition
title_full_unstemmed Combined articulatory and auditory processing for improved speech recognition
title_sort combined articulatory and auditory processing for improved speech recognition
publishDate 2013
url https://hdl.handle.net/10356/98873
http://hdl.handle.net/10220/12782
_version_ 1681041741528956928