Lasso environment model combination for robust speech recognition

In this paper, we propose a novel acoustic model adaptation method for noise robust speech recognition. Model combination is a common way to adapt acoustic models to a target test environment. For example, the mean supervectors of the adapted model are obtained as a linear combination of mean superv...

Bibliographic Details
Main Authors:	Xiao, Xiong; Li, Jinyu; Chng, Eng Siong; Li, Haizhou
Other Authors:	School of Computer Engineering
Format:	Conference or Workshop Item
Language:	English
Published:	2013
Subjects:	DRNTU::Engineering::Computer science and engineering
Online Access:	https://hdl.handle.net/10356/98809 http://hdl.handle.net/10220/13389
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-98809
record_format	dspace
spelling	sg-ntu-dr.10356-988092020-05-28T07:18:04Z Lasso environment model combination for robust speech recognition Xiao, Xiong Li, Jinyu Chng, Eng Siong Li, Haizhou School of Computer Engineering IEEE International Conference on Acoustics, Speech and Signal Processing (2012 : Kyoto, Japan) Temasek Laboratories DRNTU::Engineering::Computer science and engineering In this paper, we propose a novel acoustic model adaptation method for noise robust speech recognition. Model combination is a common way to adapt acoustic models to a target test environment. For example, the mean supervectors of the adapted model are obtained as a linear combination of mean supervectors of many pre-trained environment-dependent acoustic models. Usually, the combination weights are estimated using a maximum likelihood (ML) criterion and the weights are nonzero for all the mean supervectors. We propose to estimate the weights by using Lasso (least absolute shrinkage and selection operator) which imposes an L1 regularization term in the weight estimation problem to shrink some weights to exactly zero. Our study shows that Lasso usually shrinks to zero the weights of those mean supervectors not relevant to the test environment. By removing some nonrelevant supervectors, the obtained mean supervectors are found to be more robust against noise distortions. Experimental results on Aurora-2 task show that the Lasso-based mean combination consistently outperforms ML-based combination. 2013-09-09T06:34:03Z 2019-12-06T19:59:51Z 2013-09-09T06:34:03Z 2019-12-06T19:59:51Z 2012 2012 Conference Paper Xiao, X., Li, J., Chng, E. S., & Li, H. (2012). Lasso environment model combination for robust speech recognition. 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 4305-4308. https://hdl.handle.net/10356/98809 http://hdl.handle.net/10220/13389 10.1109/ICASSP.2012.6288871 en © 2012 IEEE.
institution	Nanyang Technological University
building	NTU Library
country	Singapore
collection	DR-NTU
language	English
topic	DRNTU::Engineering::Computer science and engineering
spellingShingle	DRNTU::Engineering::Computer science and engineering Xiao, Xiong Li, Jinyu Chng, Eng Siong Li, Haizhou Lasso environment model combination for robust speech recognition
description	In this paper, we propose a novel acoustic model adaptation method for noise robust speech recognition. Model combination is a common way to adapt acoustic models to a target test environment. For example, the mean supervectors of the adapted model are obtained as a linear combination of mean supervectors of many pre-trained environment-dependent acoustic models. Usually, the combination weights are estimated using a maximum likelihood (ML) criterion and the weights are nonzero for all the mean supervectors. We propose to estimate the weights by using Lasso (least absolute shrinkage and selection operator) which imposes an L1 regularization term in the weight estimation problem to shrink some weights to exactly zero. Our study shows that Lasso usually shrinks to zero the weights of those mean supervectors not relevant to the test environment. By removing some nonrelevant supervectors, the obtained mean supervectors are found to be more robust against noise distortions. Experimental results on Aurora-2 task show that the Lasso-based mean combination consistently outperforms ML-based combination.
author2	School of Computer Engineering
author_facet	School of Computer Engineering Xiao, Xiong Li, Jinyu Chng, Eng Siong Li, Haizhou
format	Conference or Workshop Item
author	Xiao, Xiong Li, Jinyu Chng, Eng Siong Li, Haizhou
author_sort	Xiao, Xiong
title	Lasso environment model combination for robust speech recognition
title_short	Lasso environment model combination for robust speech recognition
title_full	Lasso environment model combination for robust speech recognition
title_fullStr	Lasso environment model combination for robust speech recognition
title_full_unstemmed	Lasso environment model combination for robust speech recognition
title_sort	lasso environment model combination for robust speech recognition
publishDate	2013
url	https://hdl.handle.net/10356/98809 http://hdl.handle.net/10220/13389
_version_	1681059616335593472
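
The description above says the combination weights are estimated with Lasso, so that the weights of mean supervectors not relevant to the test environment shrink to exactly zero. The following is a minimal sketch of that idea under stated assumptions, not the authors' implementation: it stands in a plain least-squares fit with an L1 penalty for the paper's weight estimation problem, solves it with ISTA (iterative soft-thresholding), and runs on synthetic data. The names `lasso_weights`, `soft_threshold`, `M`, `y`, and `lam` are illustrative and do not come from the paper.

```python
import numpy as np


def soft_threshold(x, t):
    """Shrink each entry of x toward zero by t; entries within [-t, t] become 0."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def lasso_weights(M, y, lam, n_iter=2000):
    """Minimize 0.5 * ||y - M w||^2 + lam * ||w||_1 with ISTA.

    M   : (D, K) matrix whose columns are K environment mean supervectors
          (assumed layout for this sketch).
    y   : (D,) target supervector for the test environment (assumed).
    lam : L1 regularization strength.
    """
    # Step size 1/L, where L is the Lipschitz constant of the gradient
    # of the quadratic term (squared spectral norm of M).
    L = np.linalg.norm(M, 2) ** 2
    w = np.zeros(M.shape[1])
    for _ in range(n_iter):
        grad = M.T @ (M @ w - y)
        w = soft_threshold(w - grad / L, lam / L)
    return w


# Synthetic example: only 2 of the 10 "environments" contribute to the target.
rng = np.random.default_rng(0)
D, K = 200, 10
M = rng.normal(size=(D, K))
true_w = np.zeros(K)
true_w[[1, 4]] = [0.7, 0.3]
y = M @ true_w + 0.01 * rng.normal(size=D)

# Unregularized least squares: generally nonzero weight on every column.
w_ls = np.linalg.lstsq(M, y, rcond=None)[0]
# L1-regularized fit: weights on irrelevant columns can become exactly zero.
w_lasso = lasso_weights(M, y, lam=5.0)

print("least squares nonzeros:", np.count_nonzero(w_ls))
print("lasso nonzeros:        ", np.count_nonzero(w_lasso))
```

The soft-thresholding step is what produces exact zeros; an unpenalized fit only makes irrelevant weights small. The value of `lam` here is arbitrary and would need tuning in any real setting.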
