Robust text-independent speaker verification in environmental noise

Automatic speaker verification has many potential applications in security, surveillance and access control. In many of these applications, it is necessary to verify the speaker based on a short and noise-degraded speech utterance. This thesis addresses the problem of robust speaker verification in...

Bibliographic Details
Main Author: Panda, Ashish
Other Authors: Thambipillai Srikanthan
Format: Theses and Dissertations
Language: English
Published: 2011
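Subjects: DRNTU::Engineering::Electrical and electronic engineering::Electronic systems::Signal processing
DRNTU::Engineering::Computer science and engineering::Computing methodologies::Pattern recognition
DRNTU::Engineering::Electrical and electronic engineering::Electronic systems::Biometrics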
Online Access: https://hdl.handle.net/10356/46290
Institution: Nanyang Technological University
id sg-ntu-dr.10356-46290
record_format dspace
spelling sg-ntu-dr.10356-46290 2023-03-04T00:47:55Z Robust text-independent speaker verification in environmental noise Panda, Ashish Thambipillai Srikanthan School of Computer Engineering Centre for High Performance Embedded Systems DRNTU::Engineering::Electrical and electronic engineering::Electronic systems::Signal processing DRNTU::Engineering::Computer science and engineering::Computing methodologies::Pattern recognition DRNTU::Engineering::Electrical and electronic engineering::Electronic systems::Biometrics DOCTOR OF PHILOSOPHY (SCE) 2011-11-28T08:53:42Z 2011-11-28T08:53:42Z 2011 2011 Thesis Panda, A. (2011). Robust text-independent speaker verification in environmental noise. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/46290 10.32657/10356/46290 en 137 p. application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering::Electrical and electronic engineering::Electronic systems::Signal processing
DRNTU::Engineering::Computer science and engineering::Computing methodologies::Pattern recognition
DRNTU::Engineering::Electrical and electronic engineering::Electronic systems::Biometrics
description Automatic speaker verification has many potential applications in security, surveillance and access control. In many of these applications, the speaker must be verified from a short, noise-degraded speech utterance. This thesis addresses robust speaker verification in environmental noise by introducing novel, computationally efficient techniques suited to realistic conditions. It also applies psychoacoustics to realize an adaptive model compensation technique. The probabilistic spectral subtraction (PSS) technique was investigated in detail and then extended to accommodate noisy training utterances through a novel training scheme. The proposed training scheme has been shown to reduce the equal error rate by 20% on average over the conventional procedure. The parallel model combination technique was investigated next because of its inherent computational efficiency. While it reduced the equal error rate further compared to PSS, it was limited by an inaccurate noise corruption function and by its reliance on accurate noise estimation. To address the inaccurate noise corruption function, the max function, a non-linear function, was evaluated as an alternative noise corruption function. This led to the development of a new generalized compensation scheme that efficiently estimates the transformed model parameters for non-linear noise corruption functions. Experimental evaluations demonstrate that the proposed max-function-based compensation scheme provides a better performance gain in white noise conditions, whereas the additive function provides better performance in pink noise conditions. To overcome the limitation that neither the max function nor the additive function performs effectively across different types of noise, a novel psychoacoustic noise corruption function is proposed that exploits the masking properties of noise and speech signals. The psychoacoustic noise corruption function and the generalized compensation scheme were then combined into a psychoacoustic model compensation technique that performs effectively across different types of noise. Experimental evaluations of the proposed technique demonstrate that it provides superior performance in both white and pink noise conditions, outperforming parallel model combination by 36% and max-function-based model compensation by 24%. A new multi-conditioning approach, based on psychoacoustic model compensation, has also been proposed to deal with realistic and complex noise conditions.
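The abstract contrasts an additive and a max noise corruption function but this record gives no formulas for either. The sketch below is only an illustration under stated assumptions: features are log-spectral values, the additive function is taken to mean that speech and noise add in the linear spectral domain (a log-add), and the max function keeps the larger of the two log values. All function names are hypothetical and do not come from the thesis. The psychoacoustic function is not sketched because the record does not define it.

```python
import math


def additive_corruption(speech_log, noise_log):
    """Assumed additive model: log(exp(speech) + exp(noise)), computed stably."""
    hi = max(speech_log, noise_log)
    lo = min(speech_log, noise_log)
    return hi + math.log1p(math.exp(lo - hi))


def max_corruption(speech_log, noise_log):
    """Assumed max model: the louder component dominates the log spectrum."""
    return max(speech_log, noise_log)


def corrupt_frame(speech_frame, noise_frame, corruption):
    """Apply a corruption function to each pair of log-spectral values."""
    return [corruption(x, n) for x, n in zip(speech_frame, noise_frame)]


if __name__ == "__main__":
    # Illustrative values only, not data from the thesis.
    speech = [2.0, 0.5, -1.0]
    noise = [0.0, 0.5, 1.0]
    print(corrupt_frame(speech, noise, additive_corruption))
    print(corrupt_frame(speech, noise, max_corruption))
```

Under these assumptions the two functions differ by at most log 2 per component, with the largest gap where speech and noise have equal energy, so the choice between them matters most in that regime.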
author2 Thambipillai Srikanthan
format Theses and Dissertations
author Panda, Ashish
title Robust text-independent speaker verification in environmental noise
publishDate 2011
url https://hdl.handle.net/10356/46290
_version_ 1759857950484070400