Robust speaker verification system with anti-spoofing detection and DNN feature enhancement modules

This thesis focuses on the robustness issues of speaker verification (SV) systems. Although current SV systems perform well under clean conditions, their performance degrades dramatically in real-world uncontrolled environments. The reliability of current SV systems is also questionable under sp...

Bibliographic Details
Main Author:	Du, Steven
Other Authors:	Chng Eng Siong
Format:	Theses and Dissertations
Language:	English
Published:	2015
Subjects:	DRNTU::Engineering::General
Online Access:	https://hdl.handle.net/10356/65396
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-65396
record_format	dspace
spelling	sg-ntu-dr.10356-65396 2023-03-04T00:38:20Z Robust speaker verification system with anti-spoofing detection and DNN feature enhancement modules Du, Steven Chng Eng Siong School of Computer Engineering Emerging Research Lab DRNTU::Engineering::General This thesis focuses on the robustness issues of speaker verification (SV) systems. Although current SV systems perform well under clean conditions, their performance degrades dramatically in real-world uncontrolled environments. The reliability of current SV systems is also questionable under spoofing attacks. These pitfalls severely limit their deployment in many applications. This thesis presents approaches to combat these two robustness issues, namely noise robustness and spoofing attacks. To address the noise robustness issue, the use of deep neural networks (DNN) as a feature compensation method in the front-end module of the SV system is proposed. The motivation for using a DNN is its success in various related speech fields and its ability to model nonlinear relationships between high-dimensional inputs and outputs. In this work, a DNN is used to convert noisy input features into clean features. The proposed method is evaluated using the benchmark Speaker Recognition Evaluation (SRE) 2010 dataset provided by the National Institute of Standards and Technology (NIST). To focus on the effect of feature pre-processing, the SV system is trained using noise-free speech and evaluated on noise-corrupted speech. Results show that the proposed DNN feature compensation improves the equal error rate (EER) by 2%-25% under different unseen noise types at various SNR levels. To address the spoofing attack issue, the use of long-temporal, high-dimensional speech features derived from both magnitude and phase spectra as input features to neural network (NN) classifiers is proposed. The long-term temporal information is incorporated by concatenating 31 successive frames as the input feature to the NN classifier. The classifier is then used to predict the posterior probability of the test speech being spoofed speech. Four speakers from the CMU-ARCTIC database are selected for spoofing data generation and method evaluation. Spoofing data is generated by four synthesis methods, namely AHOcoder, STRAIGHT, JD-GMM with maximum likelihood parameter generation, and weighted correlation-based frequency warping (CFW). The results show that both long-term information and the detailed information maintained in high-dimensional features improve the performance of synthetic speech detection significantly. The proposed method was extended and used to compete in the ASVspoof 2015 challenge, where it achieved the best results in the closed-set challenge among 16 teams worldwide. MASTER OF ENGINEERING (SCE) 2015-09-08T05:04:48Z 2015-09-08T05:04:48Z 2015 2015 Thesis Du, S. (2015). Robust speaker verification system with anti-spoofing detection and DNN feature enhancement modules. Master's thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/65396 10.32657/10356/65396 en 81 p. application/pdf
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	DRNTU::Engineering::General
spellingShingle	DRNTU::Engineering::General Du, Steven Robust speaker verification system with anti-spoofing detection and DNN feature enhancement modules
description	This thesis focuses on the robustness issues of speaker verification (SV) systems. Although current SV systems perform well under clean conditions, their performance degrades dramatically in real-world uncontrolled environments. The reliability of current SV systems is also questionable under spoofing attacks. These pitfalls severely limit their deployment in many applications. This thesis presents approaches to combat these two robustness issues, namely noise robustness and spoofing attacks. To address the noise robustness issue, the use of deep neural networks (DNN) as a feature compensation method in the front-end module of the SV system is proposed. The motivation for using a DNN is its success in various related speech fields and its ability to model nonlinear relationships between high-dimensional inputs and outputs. In this work, a DNN is used to convert noisy input features into clean features. The proposed method is evaluated using the benchmark Speaker Recognition Evaluation (SRE) 2010 dataset provided by the National Institute of Standards and Technology (NIST). To focus on the effect of feature pre-processing, the SV system is trained using noise-free speech and evaluated on noise-corrupted speech. Results show that the proposed DNN feature compensation improves the equal error rate (EER) by 2%-25% under different unseen noise types at various SNR levels. To address the spoofing attack issue, the use of long-temporal, high-dimensional speech features derived from both magnitude and phase spectra as input features to neural network (NN) classifiers is proposed. The long-term temporal information is incorporated by concatenating 31 successive frames as the input feature to the NN classifier. The classifier is then used to predict the posterior probability of the test speech being spoofed speech. Four speakers from the CMU-ARCTIC database are selected for spoofing data generation and method evaluation. Spoofing data is generated by four synthesis methods, namely AHOcoder, STRAIGHT, JD-GMM with maximum likelihood parameter generation, and weighted correlation-based frequency warping (CFW). The results show that both long-term information and the detailed information maintained in high-dimensional features improve the performance of synthetic speech detection significantly. The proposed method was extended and used to compete in the ASVspoof 2015 challenge, where it achieved the best results in the closed-set challenge among 16 teams worldwide.
author2	Chng Eng Siong
author_facet	Chng Eng Siong Du, Steven
format	Theses and Dissertations
author	Du, Steven
author_sort	Du, Steven
title	Robust speaker verification system with anti-spoofing detection and DNN feature enhancement modules
publishDate	2015
url	https://hdl.handle.net/10356/65396
_version_	1759854838745661440
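
The description above states that long-term temporal information is incorporated by concatenating 31 successive frames into a single classifier input. A minimal Python sketch of that splicing step is given here. The function name splice_frames, the centred window (15 frames on each side of the current frame), and the edge handling by repeating the first or last frame are assumptions made for illustration; the record does not specify them, and this is not the thesis's implementation.

```python
from typing import List, Sequence


def splice_frames(
    frames: Sequence[Sequence[float]], context: int = 15
) -> List[List[float]]:
    """Concatenate each frame with `context` neighbours on each side.

    With context=15 the window spans 2 * 15 + 1 = 31 frames.
    Assumption: frames beyond the utterance boundaries are filled by
    repeating the first or last frame; the source does not say how
    boundaries are handled.
    """
    n = len(frames)
    if n == 0:
        return []
    spliced: List[List[float]] = []
    for t in range(n):
        window: List[float] = []
        for offset in range(-context, context + 1):
            # Clamp the index so edge frames reuse the nearest valid frame.
            idx = min(max(t + offset, 0), n - 1)
            window.extend(frames[idx])
        spliced.append(window)
    return spliced


# Toy input: 5 frames of 2-dimensional features (hypothetical values).
toy = [[float(t), float(t) + 0.5] for t in range(5)]
out = splice_frames(toy, context=15)
```

Each output vector has 31 times the per-frame dimension, so a classifier fed with these vectors sees roughly the surrounding 31 frames at once rather than a single frame.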
