Psychoacoustic model for robust speech recognition

This thesis presents a detailed study of psychoacoustic modeling for feature extraction in robust speech recognition. In an automatic speech recognition (ASR) system, feature extraction is critical to the recognizer's performance. The most popular feature vectors for ASR are Mel Frequency Cepstral Coefficients (MFCC).


Bibliographic Details
Main Author: Luo, Xue Wen
Other Authors: Soon Ing Yann
Format: Theses and Dissertations
Language: English
Published: 2010
Online Access: https://hdl.handle.net/10356/41749
Institution: Nanyang Technological University
id sg-ntu-dr.10356-41749
record_format dspace
spelling sg-ntu-dr.10356-41749 2023-07-04T17:05:46Z Psychoacoustic model for robust speech recognition Luo, Xue Wen Soon Ing Yann School of Electrical and Electronic Engineering DRNTU::Engineering::Electrical and electronic engineering::Electronic systems::Signal processing This thesis presents a detailed study of psychoacoustic modeling for feature extraction in robust speech recognition. In an automatic speech recognition (ASR) system, feature extraction is critical to the recognizer's performance. The most popular feature vectors for ASR are Mel Frequency Cepstral Coefficients (MFCC). However, it is well known that their performance drops dramatically under noisy conditions. One objective of this thesis is to improve the robustness of a recognizer. Compared with an ASR system, humans are good at tolerating background noise; hence, psychoacoustic modeling of the human hearing system is investigated and integrated into the feature extraction process of a speech recognizer to increase its robustness. MASTER OF ENGINEERING (EEE) 2010-08-06T07:21:23Z 2010-08-06T07:21:23Z 2008 2008 Thesis Luo, X. W. (2008). Psychoacoustic model for robust speech recognition. Master’s thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/41749 10.32657/10356/41749 en 108 p. application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering::Electrical and electronic engineering::Electronic systems::Signal processing
spellingShingle DRNTU::Engineering::Electrical and electronic engineering::Electronic systems::Signal processing
Luo, Xue Wen
Psychoacoustic model for robust speech recognition
description This thesis presents a detailed study of psychoacoustic modeling for feature extraction in robust speech recognition. In an automatic speech recognition (ASR) system, feature extraction is critical to the recognizer's performance. The most popular feature vectors for ASR are Mel Frequency Cepstral Coefficients (MFCC). However, it is well known that their performance drops dramatically under noisy conditions. One objective of this thesis is to improve the robustness of a recognizer. Compared with an ASR system, humans are good at tolerating background noise; hence, psychoacoustic modeling of the human hearing system is investigated and integrated into the feature extraction process of a speech recognizer to increase its robustness.
author2 Soon Ing Yann
author_facet Soon Ing Yann
Luo, Xue Wen
format Theses and Dissertations
author Luo, Xue Wen
author_sort Luo, Xue Wen
title Psychoacoustic model for robust speech recognition
publishDate 2010
url https://hdl.handle.net/10356/41749
_version_ 1772826121824370688