The NNi Vietnamese speech recognition system for MediaEval 2016
This paper provides an overall description of the Vietnamese speech recognition system developed by the joint team for MediaEval 2016. The submitted system consisted of 3 subsystems, and adopted different deep neural network-based techniques such as fMLLR transformed bottleneck features, sequence tr...
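Main Authors: | Xiao, Xiong; Nwe, Tin Lay; Chng, Eng Siong; Ma, Bin; Li, Haizhou; Wang, Lei; Ni, Chongjia; Leung, Cheung-Chi; You, Changhuai; Xie, Lei; Xu, Haihua |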
---|---|
Other Authors: | School of Computer Science and Engineering |
Format: | Conference or Workshop Item |
Language: | English |
Published: | 2019 |
Subjects: | Vietnamese Recognition DRNTU::Engineering::Computer science and engineering |
Online Access: | https://hdl.handle.net/10356/79851 http://hdl.handle.net/10220/48316 http://ceur-ws.org/Vol-1739/MediaEval_2016_paper_52.pdf |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-79851 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-798512019-12-06T13:35:20Z The NNi Vietnamese speech recognition system for mediaeval 2016 Xiao, Xiong Nwe, Tin Lay Chng, Eng Siong Ma, Bin Li, Haizhou Wang, Lei Ni, Chongjia Leung, Cheung-Chi You, Changhuai Xie, Lei Xu, Haihua School of Computer Science and Engineering Multimedia Benchmark Workshop Vietnamese Recognition DRNTU::Engineering::Computer science and engineering This paper provides an overall description of the Vietnamese speech recognition system developed by the joint team for MediaEval 2016. The submitted system consisted of 3 subsystems, and adopted different deep neural network-based techniques such as fMLLR transformed bottleneck features, sequence training, etc. Besides the acoustic modeling techniques, speech data augmentation was also examined to develop a more robust acoustic model. The I2R team collected a number of text resources from the Internet and made them available to other participants in the task. The web text crawled from the Internet was used to train a 5-gram language model. The submitted system obtained the token error rate (TER) of 15.1, 23.0 and 50.5 on Devel local set, Devel set and Test set, respectively. Published version 2019-05-22T05:09:30Z 2019-12-06T13:35:20Z 2019-05-22T05:09:30Z 2019-12-06T13:35:20Z 2016 Conference Paper Wang, L., Ni, C., Leung, C. -C., You, C., Xie, L., Xu, H., . . . Li, H. (2016). The NNi Vietnamese speech recognition system for mediaeval 2016. Multimedia Benchmark Workshop, 1739. https://hdl.handle.net/10356/79851 http://hdl.handle.net/10220/48316 http://ceur-ws.org/Vol-1739/MediaEval_2016_paper_52.pdf en © 2016 The Author(s). 3 p. application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
country |
Singapore |
collection |
DR-NTU |
language |
English |
topic |
Vietnamese Recognition DRNTU::Engineering::Computer science and engineering |
spellingShingle |
Vietnamese Recognition DRNTU::Engineering::Computer science and engineering Xiao, Xiong Nwe, Tin Lay Chng, Eng Siong Ma, Bin Li, Haizhou Wang, Lei Ni, Chongjia Leung, Cheung-Chi You, Changhuai Xie, Lei Xu, Haihua The NNi Vietnamese speech recognition system for mediaeval 2016 |
description |
This paper gives an overall description of the Vietnamese speech recognition system developed by the joint team for MediaEval 2016. The submitted system consisted of three subsystems and adopted several deep neural network-based techniques, including fMLLR-transformed bottleneck features and sequence training. Besides the acoustic modeling techniques, speech data augmentation was examined as a way to develop a more robust acoustic model. The I2R team collected a number of text resources from the Internet and made them available to other participants in the task. The web text crawled from the Internet was used to train a 5-gram language model. The submitted system obtained token error rates (TER) of 15.1, 23.0 and 50.5 on the Devel local set, Devel set and Test set, respectively. |
author2 |
School of Computer Science and Engineering |
author_facet |
School of Computer Science and Engineering Xiao, Xiong Nwe, Tin Lay Chng, Eng Siong Ma, Bin Li, Haizhou Wang, Lei Ni, Chongjia Leung, Cheung-Chi You, Changhuai Xie, Lei Xu, Haihua |
format |
Conference or Workshop Item |
author |
Xiao, Xiong Nwe, Tin Lay Chng, Eng Siong Ma, Bin Li, Haizhou Wang, Lei Ni, Chongjia Leung, Cheung-Chi You, Changhuai Xie, Lei Xu, Haihua |
author_sort |
Xiao, Xiong |
title |
The NNi Vietnamese speech recognition system for mediaeval 2016 |
title_short |
The NNi Vietnamese speech recognition system for mediaeval 2016 |
title_full |
The NNi Vietnamese speech recognition system for mediaeval 2016 |
title_fullStr |
The NNi Vietnamese speech recognition system for mediaeval 2016 |
title_full_unstemmed |
The NNi Vietnamese speech recognition system for mediaeval 2016 |
title_sort |
nni vietnamese speech recognition system for mediaeval 2016 |
publishDate |
2019 |
url |
https://hdl.handle.net/10356/79851 http://hdl.handle.net/10220/48316 http://ceur-ws.org/Vol-1739/MediaEval_2016_paper_52.pdf |
_version_ |
1681049018075971584 |