An efficient transformer-based model for Vietnamese punctuation prediction
In both formal and informal texts, missing punctuation marks make the texts confusing and challenging to read. This paper aims to conduct exhaustive experiments to investigate the benefits of the pre-trained Transformer-based models on two Vietnamese punctuation datasets. The experimental results sh...
Main Authors: | TRAN, Hieu; DINH, Cuong V.; PHAM, Hong Quang; NGUYEN, Binh T. |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University, 2021 |
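Subjects: | Punctuation prediction; Transfer learning; Transformer models; Numerical Analysis and Computation; South and Southeast Asian Languages and Societies; Theory and Algorithms |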
Online Access: | https://ink.library.smu.edu.sg/sis_research/7102 https://ink.library.smu.edu.sg/context/sis_research/article/8105/viewcontent/Tran2021_AnEfficientTransformer_Vietnamese_pv.pdf |
Institution: | Singapore Management University |
id |
sg-smu-ink.sis_research-8105 |
---|---|
record_format |
dspace |
spelling |
sg-smu-ink.sis_research-8105 2022-04-14T11:53:34Z An efficient transformer-based model for Vietnamese punctuation prediction TRAN, Hieu; DINH, Cuong V.; PHAM, Hong Quang; NGUYEN, Binh T. In both formal and informal texts, missing punctuation marks make the text confusing and hard to read. This paper conducts exhaustive experiments to investigate the benefits of pre-trained Transformer-based models on two Vietnamese punctuation datasets. The experimental results show that our models achieve encouraging results, and that adding Bi-LSTM and/or CRF layers on top of the proposed models can further boost performance. Finally, our best model significantly outperforms state-of-the-art approaches on both the novel and news datasets for the Vietnamese language, with gains of up to 21.45% and 18.27%, respectively, in overall F1-score. 2021-07-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/7102 info:doi/10.1007/978-3-030-79463-7_5 https://ink.library.smu.edu.sg/context/sis_research/article/8105/viewcontent/Tran2021_AnEfficientTransformer_Vietnamese_pv.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Punctuation prediction Transfer learning Transformer models Numerical Analysis and Computation South and Southeast Asian Languages and Societies Theory and Algorithms |
institution |
Singapore Management University |
building |
SMU Libraries |
continent |
Asia |
country |
Singapore |
content_provider |
SMU Libraries |
collection |
InK@SMU |
language |
English |
topic |
Punctuation prediction; Transfer learning; Transformer models; Numerical Analysis and Computation; South and Southeast Asian Languages and Societies; Theory and Algorithms |
description |
In both formal and informal texts, missing punctuation marks make the text confusing and hard to read. This paper conducts exhaustive experiments to investigate the benefits of pre-trained Transformer-based models on two Vietnamese punctuation datasets. The experimental results show that our models achieve encouraging results, and that adding Bi-LSTM and/or CRF layers on top of the proposed models can further boost performance. Finally, our best model significantly outperforms state-of-the-art approaches on both the novel and news datasets for the Vietnamese language, with gains of up to 21.45% and 18.27%, respectively, in overall F1-score. |
format |
text |
author |
TRAN, Hieu; DINH, Cuong V.; PHAM, Hong Quang; NGUYEN, Binh T. |
author_sort |
TRAN, Hieu |
title |
An efficient transformer-based model for Vietnamese punctuation prediction |
title_sort |
efficient transformer-based model for vietnamese punctuation prediction |
publisher |
Institutional Knowledge at Singapore Management University |
publishDate |
2021 |
url |
https://ink.library.smu.edu.sg/sis_research/7102 https://ink.library.smu.edu.sg/context/sis_research/article/8105/viewcontent/Tran2021_AnEfficientTransformer_Vietnamese_pv.pdf |
_version_ |
1770576212659273728 |