Collection and annotation of Malay conversational speech corpus
We report the development of a Malay conversational speech corpus as part of our research in spontaneous conversational speech LVCSR. This corpus development effort is a collaboration between NTU and USM. The goal is to collect, transcribe, and annotate 50 hours of conversational Malay speech. The...
Main Authors: | Chong, Tze Yuang; Xiao, Xiong; Tan, Tien-Ping; Chng, Eng Siong; Li, Haizhou |
---|---|
Other Authors: | School of Computer Engineering |
Format: | Conference or Workshop Item |
Language: | English |
Published: | 2013 |
Online Access: | https://hdl.handle.net/10356/98798 http://hdl.handle.net/10220/12670 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-98798 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-98798 2020-05-28T07:17:22Z Collection and annotation of Malay conversational speech corpus. Chong, Tze Yuang; Xiao, Xiong; Tan, Tien-Ping; Chng, Eng Siong; Li, Haizhou. School of Computer Engineering. International Conference on Speech Database and Assessments (2012 : Macau). We report the development of a Malay conversational speech corpus as part of our research in spontaneous conversational speech LVCSR. This corpus development effort is a collaboration between NTU and USM. The goal is to collect, transcribe, and annotate 50 hours of conversational Malay speech. Conversations are recorded from both close-talk and telephone channels, and the two speakers' utterances are kept in separate tracks. Besides the word transcription, we also annotate linguistic phenomena such as fillers and disfluencies. To date, 20 hours have been recorded, transcribed, and analyzed. The details of our analysis are presented in this report. 2013-07-31T08:42:59Z 2019-12-06T19:59:46Z 2012 Conference Paper. Chong, T. Y., Xiao, X., Tan, T. P., Chng, E. S., & Li, H. (2012). Collection and annotation of Malay conversational speech corpus. 2012 International Conference on Speech Database and Assessments, 30-35. https://hdl.handle.net/10356/98798 http://hdl.handle.net/10220/12670 doi:10.1109/ICSDA.2012.6422473 en |
institution |
Nanyang Technological University |
building |
NTU Library |
country |
Singapore |
collection |
DR-NTU |
language |
English |
description |
We report the development of a Malay conversational speech corpus as part of our research in spontaneous conversational speech LVCSR. This corpus development effort is a collaboration between NTU and USM. The goal is to collect, transcribe, and annotate 50 hours of conversational Malay speech. Conversations are recorded from both close-talk and telephone channels, and the two speakers' utterances are kept in separate tracks. Besides the word transcription, we also annotate linguistic phenomena such as fillers and disfluencies. To date, 20 hours have been recorded, transcribed, and analyzed. The details of our analysis are presented in this report. |
author2 |
School of Computer Engineering |
author_facet |
School of Computer Engineering Chong, Tze Yuang Xiao, Xiong Tan, Tien-Ping Chng, Eng Siong Li, Haizhou |
format |
Conference or Workshop Item |
author |
Chong, Tze Yuang Xiao, Xiong Tan, Tien-Ping Chng, Eng Siong Li, Haizhou |
author_sort |
Chong, Tze Yuang |
title |
Collection and annotation of Malay conversational speech corpus |
publishDate |
2013 |
url |
https://hdl.handle.net/10356/98798 http://hdl.handle.net/10220/12670 |
_version_ |
1681059565712441344 |