The NNi Vietnamese speech recognition system for MediaEval 2016

This paper provides an overall description of the Vietnamese speech recognition system developed by the joint team for MediaEval 2016. The submitted system consisted of 3 subsystems, and adopted different deep neural network-based techniques such as fMLLR transformed bottleneck features, sequence tr...

Bibliographic Details
Main Authors:	Xiao, Xiong, Nwe, Tin Lay, Chng, Eng Siong, Ma, Bin, Li, Haizhou, Wang, Lei, Ni, Chongjia, Leung, Cheung-Chi, You, Changhuai, Xie, Lei, Xu, Haihua
Other Authors:	School of Computer Science and Engineering
Format:	Conference or Workshop Item
Language:	English
Published:	2019
Subjects:	Vietnamese Recognition DRNTU::Engineering::Computer science and engineering
Online Access:	https://hdl.handle.net/10356/79851 http://hdl.handle.net/10220/48316 http://ceur-ws.org/Vol-1739/MediaEval_2016_paper_52.pdf
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-79851
record_format	dspace
spelling	sg-ntu-dr.10356-798512019-12-06T13:35:20Z The NNi Vietnamese speech recognition system for mediaeval 2016 Xiao, Xiong Nwe, Tin Lay Chng, Eng Siong Ma, Bin Li, Haizhou Wang, Lei Ni, Chongjia Leung, Cheung-Chi You, Changhuai Xie, Lei Xu, Haihua School of Computer Science and Engineering Multimedia Benchmark Workshop Vietnamese Recognition DRNTU::Engineering::Computer science and engineering This paper provides an overall description of the Vietnamese speech recognition system developed by the joint team for MediaEval 2016. The submitted system consisted of 3 subsystems, and adopted different deep neural network-based techniques such as fMLLR transformed bottleneck features, sequence training, etc. Besides the acoustic modeling techniques, speech data augmentation was also examined to develop a more robust acoustic model. The I2R team collected a number of text resources from the Internet and made them available to other participants in the task. The web text crawled from the Internet was used to train a 5-gram language model. The submitted system obtained the token error rate (TER) of 15.1, 23.0 and 50.5 on Devel local set, Devel set and Test set, respectively. Published version 2019-05-22T05:09:30Z 2019-12-06T13:35:20Z 2019-05-22T05:09:30Z 2019-12-06T13:35:20Z 2016 Conference Paper Wang, L., Ni, C., Leung, C. -C., You, C., Xie, L., Xu, H., . . . Li, H. (2016). The NNi Vietnamese speech recognition system for mediaeval 2016. Multimedia Benchmark Workshop, 1739. https://hdl.handle.net/10356/79851 http://hdl.handle.net/10220/48316 http://ceur-ws.org/Vol-1739/MediaEval_2016_paper_52.pdf en © 2016 The Author(s). 3 p. application/pdf
institution	Nanyang Technological University
building	NTU Library
country	Singapore
collection	DR-NTU
language	English
topic	Vietnamese Recognition DRNTU::Engineering::Computer science and engineering
description	This paper provides an overall description of the Vietnamese speech recognition system developed by the joint team for MediaEval 2016. The submitted system consisted of 3 subsystems and adopted several deep neural network-based techniques, such as fMLLR-transformed bottleneck features and sequence training. Besides the acoustic modeling techniques, speech data augmentation was also examined to develop a more robust acoustic model. The I2R team collected a number of text resources from the Internet and made them available to other participants in the task. The web text crawled from the Internet was used to train a 5-gram language model. The submitted system obtained token error rates (TER) of 15.1, 23.0 and 50.5 on the Devel local set, Devel set and Test set, respectively.
author2	School of Computer Science and Engineering
author_facet	School of Computer Science and Engineering Xiao, Xiong Nwe, Tin Lay Chng, Eng Siong Ma, Bin Li, Haizhou Wang, Lei Ni, Chongjia Leung, Cheung-Chi You, Changhuai Xie, Lei Xu, Haihua
format	Conference or Workshop Item
author	Xiao, Xiong Nwe, Tin Lay Chng, Eng Siong Ma, Bin Li, Haizhou Wang, Lei Ni, Chongjia Leung, Cheung-Chi You, Changhuai Xie, Lei Xu, Haihua
author_sort	Xiao, Xiong
title	The NNi Vietnamese speech recognition system for MediaEval 2016
publishDate	2019
url	https://hdl.handle.net/10356/79851 http://hdl.handle.net/10220/48316 http://ceur-ws.org/Vol-1739/MediaEval_2016_paper_52.pdf
_version_	1681049018075971584

Similar Items