Vietnamese punctuation prediction using deep neural networks
Adding appropriate punctuation marks to text is an essential step in speech-to-text, where such information is usually not available. While this has been extensively studied for English, there is no large-scale dataset or comprehensive study of the punctuation prediction problem for the Vietnamese...
Main Authors: | PHAM, Thuy; NGUYEN, Nhu; PHAM, Hong Quang; CAO, Han; NGUYEN, Binh |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University 2020 |
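Subjects: | Attention model; BiLSTM; Conditional random field; Punctuation prediction; Numerical Analysis and Scientific Computing; South and Southeast Asian Languages and Societies |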
Online Access: | https://ink.library.smu.edu.sg/sis_research/7817 https://ink.library.smu.edu.sg/context/sis_research/article/8820/viewcontent/VietnamesePunc_av.pdf |
Institution: | Singapore Management University |
id |
sg-smu-ink.sis_research-8820 |
---|---|
record_format |
dspace |
spelling |
sg-smu-ink.sis_research-8820; 2023-04-25T06:12:23Z; Vietnamese punctuation prediction using deep neural networks; PHAM, Thuy; NGUYEN, Nhu; PHAM, Hong Quang; CAO, Han; NGUYEN, Binh; Adding appropriate punctuation marks to text is an essential step in speech-to-text, where such information is usually not available. While this has been extensively studied for English, there is no large-scale dataset or comprehensive study of the punctuation prediction problem for the Vietnamese language. In this paper, we collect two massive datasets and conduct a benchmark with both traditional methods and deep neural networks. We aim to publish both our data and all implementation code to facilitate further research, not only in Vietnamese punctuation prediction but also in other related fields. Our project, including datasets and implementation details, is publicly available at https://github.com/BinhMisfit/vietnamese-punctuation-prediction. 2020-01-01T08:00:00Z; text; application/pdf; https://ink.library.smu.edu.sg/sis_research/7817; info:doi/10.1007/978-3-030-38919-2_32; https://ink.library.smu.edu.sg/context/sis_research/article/8820/viewcontent/VietnamesePunc_av.pdf; http://creativecommons.org/licenses/by-nc-nd/4.0/; Research Collection School Of Computing and Information Systems; eng; Institutional Knowledge at Singapore Management University; Attention model; BiLSTM; Conditional random field; Punctuation prediction; Numerical Analysis and Scientific Computing; South and Southeast Asian Languages and Societies |
institution |
Singapore Management University |
building |
SMU Libraries |
continent |
Asia |
country |
Singapore Singapore |
content_provider |
SMU Libraries |
collection |
InK@SMU |
language |
English |
topic |
Attention model; BiLSTM; Conditional random field; Punctuation prediction; Numerical Analysis and Scientific Computing; South and Southeast Asian Languages and Societies |
spellingShingle |
Attention model; BiLSTM; Conditional random field; Punctuation prediction; Numerical Analysis and Scientific Computing; South and Southeast Asian Languages and Societies; PHAM, Thuy; NGUYEN, Nhu; PHAM, Hong Quang; CAO, Han; NGUYEN, Binh; Vietnamese punctuation prediction using deep neural networks |
description |
Adding appropriate punctuation marks to text is an essential step in speech-to-text, where such information is usually not available. While this has been extensively studied for English, there is no large-scale dataset or comprehensive study of the punctuation prediction problem for the Vietnamese language. In this paper, we collect two massive datasets and conduct a benchmark with both traditional methods and deep neural networks. We aim to publish both our data and all implementation code to facilitate further research, not only in Vietnamese punctuation prediction but also in other related fields. Our project, including datasets and implementation details, is publicly available at https://github.com/BinhMisfit/vietnamese-punctuation-prediction. |
format |
text |
author |
PHAM, Thuy; NGUYEN, Nhu; PHAM, Hong Quang; CAO, Han; NGUYEN, Binh |
author_facet |
PHAM, Thuy; NGUYEN, Nhu; PHAM, Hong Quang; CAO, Han; NGUYEN, Binh |
author_sort |
PHAM, Thuy |
title |
Vietnamese punctuation prediction using deep neural networks |
title_short |
Vietnamese punctuation prediction using deep neural networks |
title_full |
Vietnamese punctuation prediction using deep neural networks |
title_fullStr |
Vietnamese punctuation prediction using deep neural networks |
title_full_unstemmed |
Vietnamese punctuation prediction using deep neural networks |
title_sort |
vietnamese punctuation prediction using deep neural networks |
publisher |
Institutional Knowledge at Singapore Management University |
publishDate |
2020 |
url |
https://ink.library.smu.edu.sg/sis_research/7817 https://ink.library.smu.edu.sg/context/sis_research/article/8820/viewcontent/VietnamesePunc_av.pdf |
_version_ |
1770576518993412096 |