Collection and annotation of Malay conversational speech corpus

We report the development of a Malay conversational speech corpus as part of our research in spontaneous conversational speech LVCSR. This corpus development effort is a collaboration between NTU and USM. The goal is to collect, transcribe, and annotate 50 hours of conversational Malay speech. The...

Bibliographic Details
Main Authors: Chong, Tze Yuang, Xiao, Xiong, Tan, Tien-Ping, Chng, Eng Siong, Li, Haizhou
Other Authors: School of Computer Engineering
Format: Conference or Workshop Item
Language: English
Published: 2013
Online Access: https://hdl.handle.net/10356/98798
http://hdl.handle.net/10220/12670
Institution: Nanyang Technological University
id sg-ntu-dr.10356-98798
record_format dspace
spelling sg-ntu-dr.10356-98798 2020-05-28T07:17:22Z
Collection and annotation of Malay conversational speech corpus
Chong, Tze Yuang
Xiao, Xiong
Tan, Tien-Ping
Chng, Eng Siong
Li, Haizhou
School of Computer Engineering
International Conference on Speech Database and Assessments (2012 : Macau)
We report the development of a Malay conversational speech corpus as part of our research in spontaneous conversational speech LVCSR. This corpus development effort is a collaboration between NTU and USM. The goal is to collect, transcribe, and annotate 50 hours of conversational Malay speech. Each conversation is recorded from both close-talk and telephone channels, and the two speakers' utterances are kept in separate tracks. Besides the word transcription, we also annotate linguistic phenomena such as fillers and disfluencies. To date, 20 hours have been recorded, transcribed, and analyzed. The details of our analysis will be presented in this report.
2013-07-31T08:42:59Z 2019-12-06T19:59:46Z 2013-07-31T08:42:59Z 2019-12-06T19:59:46Z 2012 2012
Conference Paper
Chong, T. Y., Xiao, X., Tan, T. P., Chng, E. S., & Li, H. (2012). Collection and annotation of Malay conversational speech corpus. 2012 International Conference on Speech Database and Assessments, 30-35.
https://hdl.handle.net/10356/98798
http://hdl.handle.net/10220/12670
10.1109/ICSDA.2012.6422473
en
institution Nanyang Technological University
building NTU Library
country Singapore
collection DR-NTU
language English
description We report the development of a Malay conversational speech corpus as part of our research in spontaneous conversational speech LVCSR. This corpus development effort is a collaboration between NTU and USM. The goal is to collect, transcribe, and annotate 50 hours of conversational Malay speech. Each conversation is recorded from both close-talk and telephone channels, and the two speakers' utterances are kept in separate tracks. Besides the word transcription, we also annotate linguistic phenomena such as fillers and disfluencies. To date, 20 hours have been recorded, transcribed, and analyzed. The details of our analysis will be presented in this report.
author2 School of Computer Engineering
format Conference or Workshop Item
author Chong, Tze Yuang
Xiao, Xiong
Tan, Tien-Ping
Chng, Eng Siong
Li, Haizhou
author_sort Chong, Tze Yuang
title Collection and annotation of Malay conversational speech corpus
publishDate 2013
url https://hdl.handle.net/10356/98798
http://hdl.handle.net/10220/12670
_version_ 1681059565712441344