A first speech recognition system for Mandarin-English code-switch conversational speech

This paper presents first steps toward a large vocabulary continuous speech recognition (LVCSR) system for conversational Mandarin-English code-switching (CS) speech. We applied state-of-the-art techniques such as speaker adaptive and discriminative training to build the first baseline system on the...

Bibliographic Details
Main Authors:	Vu, Ngoc Thang, Lyu, Dau-Cheng, Weiner, Jochen, Telaar, Dominic, Schlippe, Tim, Blaicher, Fabian, Chng, Eng Siong, Schultz, Tanja, Li, Haizhou
Other Authors:	School of Computer Engineering
Format:	Conference or Workshop Item
Language:	English
Published:	2013
Subjects:	DRNTU::Engineering::Computer science and engineering
Online Access:	https://hdl.handle.net/10356/98522 http://hdl.handle.net/10220/13411
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-98522
record_format	dspace
spelling	sg-ntu-dr.10356-98522 2020-05-28T07:17:44Z IEEE International Conference on Acoustics, Speech and Signal Processing (2012 : Kyoto, Japan) 2013-09-09T07:27:33Z 2019-12-06T19:56:28Z 2012 Conference Paper Vu, N. T., Lyu, D.-C., Weiner, J., Telaar, D., Schlippe, T., Blaicher, F., et al. (2012). A first speech recognition system for Mandarin-English code-switch conversational speech. 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 4889-4892. https://hdl.handle.net/10356/98522 http://hdl.handle.net/10220/13411 10.1109/ICASSP.2012.6289015 en © 2012 IEEE
institution	Nanyang Technological University
building	NTU Library
country	Singapore
collection	DR-NTU
language	English
topic	DRNTU::Engineering::Computer science and engineering
description	This paper presents first steps toward a large vocabulary continuous speech recognition (LVCSR) system for conversational Mandarin-English code-switching (CS) speech. We applied state-of-the-art techniques such as speaker adaptive and discriminative training to build the first baseline system on the SEAME (South East Asia Mandarin-English) corpus [1]. For acoustic modeling, we applied different phone merging approaches based on the International Phonetic Alphabet (IPA) and Bhattacharyya distance in combination with discriminative training to improve accuracy. At the language model level, we investigated statistical machine translation (SMT)-based text generation approaches for building code-switching language models. Furthermore, we integrated information provided by a language identification (LID) system into the decoding process by using a multi-stream approach. Our best 2-pass system achieves a Mixed Error Rate (MER) of 36.6% on the SEAME development set.
author2	School of Computer Engineering
author_facet	School of Computer Engineering Vu, Ngoc Thang Lyu, Dau-Cheng Weiner, Jochen Telaar, Dominic Schlippe, Tim Blaicher, Fabian Chng, Eng Siong Schultz, Tanja Li, Haizhou
format	Conference or Workshop Item
author	Vu, Ngoc Thang Lyu, Dau-Cheng Weiner, Jochen Telaar, Dominic Schlippe, Tim Blaicher, Fabian Chng, Eng Siong Schultz, Tanja Li, Haizhou
author_sort	Vu, Ngoc Thang
title	A first speech recognition system for Mandarin-English code-switch conversational speech
publishDate	2013
url	https://hdl.handle.net/10356/98522 http://hdl.handle.net/10220/13411
_version_	1681056323386474496

Similar Items