A first speech recognition system for Mandarin-English code-switch conversational speech

This paper presents first steps toward a large vocabulary continuous speech recognition system (LVCSR) for conversational Mandarin-English code-switching (CS) speech. We applied state-of-the-art techniques such as speaker adaptive and discriminative training to build the first baseline system on the...

全面介紹

Saved in:
書目詳細資料
Main Authors: Vu, Ngoc Thang, Lyu, Dau-Cheng, Weiner, Jochen, Telaar, Dominic, Schlippe, Tim, Blaicher, Fabian, Chng, Eng Siong, Schultz, Tanja, Li, Haizhou
其他作者: School of Computer Engineering
格式: Conference or Workshop Item
語言:English
出版: 2013
主題:
在線閱讀:https://hdl.handle.net/10356/98522
http://hdl.handle.net/10220/13411
標簽: 添加標簽
沒有標簽, 成為第一個標記此記錄!
機構: Nanyang Technological University
語言: English
實物特徵
總結:This paper presents first steps toward a large vocabulary continuous speech recognition system (LVCSR) for conversational Mandarin-English code-switching (CS) speech. We applied state-of-the-art techniques such as speaker adaptive and discriminative training to build the first baseline system on the SEAME corpus [1] (South East Asia Mandarin-English). For acoustic modeling, we applied different phone merging approaches based on the International Phonetic Alphabet (IPA) and Bhattacharyya distance in combination with discriminative training to improve accuracy. On language model level, we investigated statistical machine translation (SMT) - based text generation approaches for building code-switching language models. Furthermore, we integrated the provided information from a language identification system (LID) into the decoding process by using a multi-stream approach. Our best 2-pass system achieves a Mixed Error Rate (MER) of 36.6% on the SEAME development set.