The NNi Vietnamese speech recognition system for MediaEval 2016

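This paper provides an overall description of the Vietnamese speech recognition system developed by the joint team for MediaEval 2016. The submitted system consisted of three subsystems and adopted deep neural network-based techniques such as fMLLR-transformed bottleneck features and sequence training. Besides the acoustic modeling techniques, speech data augmentation was also examined to develop a more robust acoustic model. The I2R team collected a number of text resources from the Internet and made them available to other participants in the task. The web text crawled from the Internet was used to train a 5-gram language model. The submitted system obtained token error rates (TER) of 15.1, 23.0 and 50.5 on the Devel local set, Devel set and Test set, respectively.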

Bibliographic Details
Main Authors: Xiao, Xiong; Nwe, Tin Lay; Chng, Eng Siong; Ma, Bin; Li, Haizhou; Wang, Lei; Ni, Chongjia; Leung, Cheung-Chi; You, Changhuai; Xie, Lei; Xu, Haihua
Other Authors: School of Computer Science and Engineering
Format: Conference or Workshop Item
Language: English
Published: 2019
Subjects: Vietnamese; Recognition; DRNTU::Engineering::Computer science and engineering
Online Access: https://hdl.handle.net/10356/79851
http://hdl.handle.net/10220/48316
http://ceur-ws.org/Vol-1739/MediaEval_2016_paper_52.pdf
Institution: Nanyang Technological University
id sg-ntu-dr.10356-79851
record_format dspace
spelling sg-ntu-dr.10356-79851 2019-12-06T13:35:20Z
Multimedia Benchmark Workshop
Published version
2019-05-22T05:09:30Z 2019-12-06T13:35:20Z
2016
Conference Paper
Wang, L., Ni, C., Leung, C.-C., You, C., Xie, L., Xu, H., . . . Li, H. (2016). The NNi Vietnamese speech recognition system for MediaEval 2016. Multimedia Benchmark Workshop, 1739.
en
© 2016 The Author(s).
3 p.
application/pdf
institution Nanyang Technological University
building NTU Library
country Singapore
collection DR-NTU
language English
topic Vietnamese
Recognition
DRNTU::Engineering::Computer science and engineering
description This paper provides an overall description of the Vietnamese speech recognition system developed by the joint team for MediaEval 2016. The submitted system consisted of three subsystems and adopted deep neural network-based techniques such as fMLLR-transformed bottleneck features and sequence training. Besides the acoustic modeling techniques, speech data augmentation was also examined to develop a more robust acoustic model. The I2R team collected a number of text resources from the Internet and made them available to other participants in the task. The web text crawled from the Internet was used to train a 5-gram language model. The submitted system obtained token error rates (TER) of 15.1, 23.0 and 50.5 on the Devel local set, Devel set and Test set, respectively.
author2 School of Computer Science and Engineering
format Conference or Workshop Item
author Xiao, Xiong
Nwe, Tin Lay
Chng, Eng Siong
Ma, Bin
Li, Haizhou
Wang, Lei
Ni, Chongjia
Leung, Cheung-Chi
You, Changhuai
Xie, Lei
Xu, Haihua
author_sort Xiao, Xiong
title The NNi Vietnamese speech recognition system for MediaEval 2016
title_sort nni vietnamese speech recognition system for mediaeval 2016
publishDate 2019
url https://hdl.handle.net/10356/79851
http://hdl.handle.net/10220/48316
http://ceur-ws.org/Vol-1739/MediaEval_2016_paper_52.pdf
_version_ 1681049018075971584