Cross match-CHMM fusion for speaker adaptation of voice biometric
The most significant factor affecting automatic voice biometric performance is variation in the signal characteristics caused by speaker-based, conversation-based, and technology variability. These variations pose a great challenge in accurately modeling and verifying a speake...
Main Authors: | Ariff, A. K.; Salleh, S. H.; Kamarulafizam, I.; Noor, A. M. |
---|---|
Format: | Article |
Language: | English |
Published: | Asian Research Publishing Network, 2017 |
Subjects: | QH Natural history |
Online Access: | http://eprints.utm.my/id/eprint/75405/1/AKAriff_%20CrossMatch-CHMMFusionforSpeaker.pdf http://eprints.utm.my/id/eprint/75405/ https://www.scopus.com/inward/record.uri?eid=2-s2.0-85011022874&partnerID=40&md5=22523e4b20d5402fada1bb079d47c797 |
Institution: | Universiti Teknologi Malaysia |
id |
my.utm.75405 |
---|---|
record_format |
eprints |
spelling |
my.utm.754052018-03-22T11:07:36Z http://eprints.utm.my/id/eprint/75405/ Cross match-CHMM fusion for speaker adaptation of voice biometric Ariff, A. K. Salleh, S. H. Kamarulafizam, I. Noor, A. M. QH Natural history The most significant factor affecting automatic voice biometric performance is variation in the signal characteristics caused by speaker-based, conversation-based, and technology variability. These variations pose a great challenge in accurately modeling and verifying a speaker. To address these variability effects, the cross match (CM) technique is proposed to provide a speaker model that can adapt to variability over time. Using a limited number of enrollment utterances, a client barcode is generated and can be updated by cross matching it with new data. Furthermore, CM adds the dimension of multimodality at the fusion level, since the similarity score from CM can be fused with the score from the default speaker modeling. The scores need to be normalized before fusion takes place. By fusing CM with the continuous Hidden Markov Model (CHMM), the new adapted model gave significant improvement in the identification and verification tasks: the equal error rate (EER) decreased from 6.51% to 1.23% in speaker identification and from 5.87% to 1.04% in speaker verification. EER also decreased over time (across five sessions) when CM was applied. The best combination of normalization and fusion methods is piecewise-linear normalization with weighted-sum fusion. Asian Research Publishing Network 2017 Article PeerReviewed application/pdf en http://eprints.utm.my/id/eprint/75405/1/AKAriff_%20CrossMatch-CHMMFusionforSpeaker.pdf Ariff, A. K. and Salleh, S. H. and Kamarulafizam, I. and Noor, A. M. (2017) Cross match-CHMM fusion for speaker adaptation of voice biometric. ARPN Journal of Engineering and Applied Sciences, 12 (2). pp. 428-434. 
ISSN 1819-6608 https://www.scopus.com/inward/record.uri?eid=2-s2.0-85011022874&partnerID=40&md5=22523e4b20d5402fada1bb079d47c797 |
institution |
Universiti Teknologi Malaysia |
building |
UTM Library |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Teknologi Malaysia |
content_source |
UTM Institutional Repository |
url_provider |
http://eprints.utm.my/ |
language |
English |
topic |
QH Natural history |
spellingShingle |
QH Natural history Ariff, A. K. Salleh, S. H. Kamarulafizam, I. Noor, A. M. Cross match-CHMM fusion for speaker adaptation of voice biometric |
description |
The most significant factor affecting automatic voice biometric performance is variation in the signal characteristics caused by speaker-based, conversation-based, and technology variability. These variations pose a great challenge in accurately modeling and verifying a speaker. To address these variability effects, the cross match (CM) technique is proposed to provide a speaker model that can adapt to variability over time. Using a limited number of enrollment utterances, a client barcode is generated and can be updated by cross matching it with new data. Furthermore, CM adds the dimension of multimodality at the fusion level, since the similarity score from CM can be fused with the score from the default speaker modeling. The scores need to be normalized before fusion takes place. By fusing CM with the continuous Hidden Markov Model (CHMM), the new adapted model gave significant improvement in the identification and verification tasks: the equal error rate (EER) decreased from 6.51% to 1.23% in speaker identification and from 5.87% to 1.04% in speaker verification. EER also decreased over time (across five sessions) when CM was applied. The best combination of normalization and fusion methods is piecewise-linear normalization with weighted-sum fusion. |
format |
Article |
author |
Ariff, A. K. Salleh, S. H. Kamarulafizam, I. Noor, A. M. |
author_facet |
Ariff, A. K. Salleh, S. H. Kamarulafizam, I. Noor, A. M. |
author_sort |
Ariff, A. K. |
title |
Cross match-CHMM fusion for speaker adaptation of voice biometric |
title_short |
Cross match-CHMM fusion for speaker adaptation of voice biometric |
title_full |
Cross match-CHMM fusion for speaker adaptation of voice biometric |
title_fullStr |
Cross match-CHMM fusion for speaker adaptation of voice biometric |
title_full_unstemmed |
Cross match-CHMM fusion for speaker adaptation of voice biometric |
title_sort |
cross match-chmm fusion for speaker adaptation of voice biometric |
publisher |
Asian Research Publishing Network |
publishDate |
2017 |
url |
http://eprints.utm.my/id/eprint/75405/1/AKAriff_%20CrossMatch-CHMMFusionforSpeaker.pdf http://eprints.utm.my/id/eprint/75405/ https://www.scopus.com/inward/record.uri?eid=2-s2.0-85011022874&partnerID=40&md5=22523e4b20d5402fada1bb079d47c797 |
_version_ |
1643657054665048064 |
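
Note on the abstract: the record's description says that CM and CHMM scores are normalized before fusion, that piecewise-linear normalization with weighted-sum fusion performed best, and that results are reported as EER. The following Python sketch illustrates those three ideas in a generic form; it is not taken from the article. The breakpoints `low` and `high`, the equal weights, the sample scores, and all function names are hypothetical, and the piecewise-linear mapping shown (clamp to 0 and 1 outside two breakpoints, linear in between) is only one common form that the article may define differently.

```python
from typing import Sequence, Tuple


def piecewise_linear_normalize(score: float, low: float, high: float) -> float:
    """Map a raw score into [0, 1]: 0 at or below `low`, 1 at or above
    `high`, and linear in between. `low` and `high` are assumed breakpoints."""
    if high <= low:
        raise ValueError("high must be greater than low")
    if score <= low:
        return 0.0
    if score >= high:
        return 1.0
    return (score - low) / (high - low)


def weighted_sum_fusion(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Fuse already-normalized scores as a weighted average."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * s for w, s in zip(weights, scores)) / total


def equal_error_rate(genuine: Sequence[float],
                     impostor: Sequence[float]) -> Tuple[float, float]:
    """Approximate the EER by sweeping every observed score as a threshold
    and picking the one where false accept and false reject rates are
    closest. Returns (eer, threshold). Higher scores mean a better match."""
    if not genuine or not impostor:
        raise ValueError("both score lists must be non-empty")
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far = sum(s >= t for s in impostor) / len(impostor)  # false accepts
        frr = sum(s < t for s in genuine) / len(genuine)     # false rejects
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Hypothetical raw scores: a CM similarity in [0, 1] and a CHMM log-likelihood.
cm_norm = piecewise_linear_normalize(0.8, low=0.0, high=1.0)
chmm_norm = piecewise_linear_normalize(-45.0, low=-60.0, high=-30.0)
fused = weighted_sum_fusion([cm_norm, chmm_norm], weights=[0.5, 0.5])
print(fused)
```

In practice the breakpoints and weights would be tuned on held-out genuine and impostor scores, and `equal_error_rate` would be applied to the fused scores to compare configurations.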