Cross match-CHMM fusion for speaker adaptation of voice biometric

The most significant factor affecting automatic voice biometric performance is variation in the signal characteristics, caused by speaker-based, conversation-based and technology variability. These variations pose a great challenge in accurately modeling and verifying a speake...

Bibliographic Details
Main Authors: Ariff, A. K., Salleh, S. H., Kamarulafizam, I., Noor, A. M.
Format: Article
Language: English
Published: Asian Research Publishing Network 2017
Subjects: QH Natural history
Online Access: http://eprints.utm.my/id/eprint/75405/1/AKAriff_%20CrossMatch-CHMMFusionforSpeaker.pdf
http://eprints.utm.my/id/eprint/75405/
https://www.scopus.com/inward/record.uri?eid=2-s2.0-85011022874&partnerID=40&md5=22523e4b20d5402fada1bb079d47c797
Institution: Universiti Teknologi Malaysia
id my.utm.75405
record_format eprints
spelling my.utm.75405 2018-03-22T11:07:36Z http://eprints.utm.my/id/eprint/75405/ Cross match-CHMM fusion for speaker adaptation of voice biometric Ariff, A. K. Salleh, S. H. Kamarulafizam, I. Noor, A. M. QH Natural history Asian Research Publishing Network 2017 Article PeerReviewed application/pdf en http://eprints.utm.my/id/eprint/75405/1/AKAriff_%20CrossMatch-CHMMFusionforSpeaker.pdf Ariff, A. K. and Salleh, S. H. and Kamarulafizam, I. and Noor, A. M. (2017) Cross match-CHMM fusion for speaker adaptation of voice biometric. ARPN Journal of Engineering and Applied Sciences, 12 (2). pp. 428-434.
ISSN 1819-6608 https://www.scopus.com/inward/record.uri?eid=2-s2.0-85011022874&partnerID=40&md5=22523e4b20d5402fada1bb079d47c797
institution Universiti Teknologi Malaysia
building UTM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Malaysia
content_source UTM Institutional Repository
url_provider http://eprints.utm.my/
language English
topic QH Natural history
description The most significant factor affecting automatic voice biometric performance is variation in the signal characteristics, caused by speaker-based, conversation-based and technology variability. These variations pose a great challenge in accurately modeling and verifying a speaker. To address these variability effects, the cross match (CM) technique is proposed to provide a speaker model that can adapt to variability over time. Using a limited number of enrollment utterances, a client barcode is generated, and it can be updated by cross matching it with new data. Furthermore, CM adds multimodality at the fusion level, because the similarity score from CM can be fused with the score from the default speaker model. The scores must be normalized before fusion takes place. Fusing CM with a continuous Hidden Markov Model (CHMM) gave a significant improvement in the identification and verification tasks: the equal error rate (EER) decreased from 6.51% to 1.23% in speaker identification and from 5.87% to 1.04% in speaker verification. EER also decreased over time (across five sessions) when CM was applied. The best combination of normalization and fusion methods was piecewise-linear normalization with weighted-sum fusion.
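The description names piecewise-linear normalization, weighted-sum fusion and the equal error rate, but this record does not give their formulas. The following is a minimal sketch of one common form of each, not the authors' implementation. The function names, the thresholds `lo` and `hi`, and the weight `w_cm` are all hypothetical and chosen only for illustration.

```python
from typing import Sequence


def piecewise_linear(score: float, lo: float, hi: float) -> float:
    """Map a raw score to [0, 1]: 0 at or below lo, 1 at or above hi, linear between.

    lo and hi are assumed thresholds; the record does not say how they are chosen.
    """
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    if score <= lo:
        return 0.0
    if score >= hi:
        return 1.0
    return (score - lo) / (hi - lo)


def weighted_sum(cm_score: float, chmm_score: float, w_cm: float = 0.5) -> float:
    """Fuse two normalized scores; w_cm is an assumed weight for the CM score."""
    if not 0.0 <= w_cm <= 1.0:
        raise ValueError("w_cm must lie in [0, 1]")
    return w_cm * cm_score + (1.0 - w_cm) * chmm_score


def equal_error_rate(genuine: Sequence[float], impostor: Sequence[float]) -> float:
    """Approximate the EER by sweeping a threshold over the observed scores.

    A trial is accepted when its score is >= the threshold. The EER is taken
    at the threshold where false accept and false reject rates are closest.
    """
    best_gap, eer = float("inf"), 1.0
    for t in sorted(set(genuine) | set(impostor)):
        frr = sum(s < t for s in genuine) / len(genuine)     # genuine rejected
        far = sum(s >= t for s in impostor) / len(impostor)  # impostor accepted
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2.0
    return eer


def fuse(cm_raw: float, chmm_raw: float) -> float:
    """Normalize each raw score on its own (made-up) range, then fuse."""
    cm = piecewise_linear(cm_raw, lo=0.0, hi=10.0)
    chmm = piecewise_linear(chmm_raw, lo=-50.0, hi=0.0)
    return weighted_sum(cm, chmm, w_cm=0.25)
```

Normalizing first matters because the two scores generally live on different scales (for example, a similarity count versus a log-likelihood), so an unnormalized weighted sum would be dominated by whichever score has the larger range.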
format Article
author Ariff, A. K.
Salleh, S. H.
Kamarulafizam, I.
Noor, A. M.
title Cross match-CHMM fusion for speaker adaptation of voice biometric
publisher Asian Research Publishing Network
publishDate 2017
_version_ 1643657054665048064