Vietnamese punctuation prediction using deep neural networks

Adding appropriate punctuation marks into text is an essential step in speech-to-text where such information is usually not available. While this has been extensively studied for English, there is no large-scale dataset and comprehensive study in the punctuation prediction problem for the Vietnamese...

Bibliographic Details
Main Authors: PHAM, Thuy; NGUYEN, Nhu; PHAM, Hong Quang; CAO, Han; NGUYEN, Binh
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2020
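Subjects: Attention model; BiLSTM; Conditional random field; Punctuation prediction; Numerical Analysis and Scientific Computing; South and Southeast Asian Languages and Societies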
Online Access: https://ink.library.smu.edu.sg/sis_research/7817
https://ink.library.smu.edu.sg/context/sis_research/article/8820/viewcontent/VietnamesePunc_av.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-8820
record_format dspace
spelling sg-smu-ink.sis_research-8820
2023-04-25T06:12:23Z
Vietnamese punctuation prediction using deep neural networks
PHAM, Thuy
NGUYEN, Nhu
PHAM, Hong Quang
CAO, Han
NGUYEN, Binh
Adding appropriate punctuation marks into text is an essential step in speech-to-text where such information is usually not available. While this has been extensively studied for English, there is no large-scale dataset and comprehensive study in the punctuation prediction problem for the Vietnamese language. In this paper, we collect two massive datasets and conduct a benchmark with both traditional methods and deep neural networks. We aim to publish both our data and all implementation codes to facilitate further research, not only in Vietnamese punctuation prediction but also in other related fields. Our project, including datasets and implementation details, is publicly available at https://github.com/BinhMisfit/vietnamese-punctuation-prediction.
2020-01-01T08:00:00Z
text
application/pdf
https://ink.library.smu.edu.sg/sis_research/7817
info:doi/10.1007/978-3-030-38919-2_32
https://ink.library.smu.edu.sg/context/sis_research/article/8820/viewcontent/VietnamesePunc_av.pdf
http://creativecommons.org/licenses/by-nc-nd/4.0/
Research Collection School Of Computing and Information Systems
eng
Institutional Knowledge at Singapore Management University
Attention model
BiLSTM
Conditional random field
Punctuation prediction
Numerical Analysis and Scientific Computing
South and Southeast Asian Languages and Societies
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Attention model
BiLSTM
Conditional random field
Punctuation prediction
Numerical Analysis and Scientific Computing
South and Southeast Asian Languages and Societies
description Adding appropriate punctuation marks into text is an essential step in speech-to-text where such information is usually not available. While this has been extensively studied for English, there is no large-scale dataset and comprehensive study in the punctuation prediction problem for the Vietnamese language. In this paper, we collect two massive datasets and conduct a benchmark with both traditional methods and deep neural networks. We aim to publish both our data and all implementation codes to facilitate further research, not only in Vietnamese punctuation prediction but also in other related fields. Our project, including datasets and implementation details, is publicly available at https://github.com/BinhMisfit/vietnamese-punctuation-prediction.
format text
author PHAM, Thuy
NGUYEN, Nhu
PHAM, Hong Quang
CAO, Han
NGUYEN, Binh
title Vietnamese punctuation prediction using deep neural networks
publisher Institutional Knowledge at Singapore Management University
publishDate 2020
url https://ink.library.smu.edu.sg/sis_research/7817
https://ink.library.smu.edu.sg/context/sis_research/article/8820/viewcontent/VietnamesePunc_av.pdf