An efficient transformer-based model for Vietnamese punctuation prediction

In both formal and informal texts, missing punctuation marks make the texts confusing and challenging to read. This paper aims to conduct exhaustive experiments to investigate the benefits of the pre-trained Transformer-based models on two Vietnamese punctuation datasets. The experimental results sh...

Bibliographic Details
Main Authors: TRAN, Hieu, DINH, Cuong V., PHAM, Hong Quang, NGUYEN, Binh T.
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2021
Subjects:
Online Access: https://ink.library.smu.edu.sg/sis_research/7102
https://ink.library.smu.edu.sg/context/sis_research/article/8105/viewcontent/Tran2021_AnEfficientTransformer_Vietnamese_pv.pdf
Institution: Singapore Management University
Language: English
id sg-smu-ink.sis_research-8105
record_format dspace
spelling sg-smu-ink.sis_research-8105 2022-04-14T11:53:34Z An efficient transformer-based model for Vietnamese punctuation prediction TRAN, Hieu DINH, Cuong V. PHAM, Hong Quang NGUYEN, Binh T. In both formal and informal texts, missing punctuation marks make the texts confusing and challenging to read. This paper aims to conduct exhaustive experiments to investigate the benefits of the pre-trained Transformer-based models on two Vietnamese punctuation datasets. The experimental results show that our models achieve encouraging results, and that adding Bi-LSTM and/or CRF layers on top of the proposed models further boosts performance. Finally, our best model significantly outperforms state-of-the-art approaches on both the novel and news datasets for the Vietnamese language, improving overall F1-scores by up to 21.45% and 18.27%, respectively. 2021-07-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/7102 info:doi/10.1007/978-3-030-79463-7_5 https://ink.library.smu.edu.sg/context/sis_research/article/8105/viewcontent/Tran2021_AnEfficientTransformer_Vietnamese_pv.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Punctuation prediction Transfer learning Transformer models Numerical Analysis and Computation South and Southeast Asian Languages and Societies Theory and Algorithms
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Punctuation prediction
Transfer learning
Transformer models
Numerical Analysis and Computation
South and Southeast Asian Languages and Societies
Theory and Algorithms
spellingShingle Punctuation prediction
Transfer learning
Transformer models
Numerical Analysis and Computation
South and Southeast Asian Languages and Societies
Theory and Algorithms
TRAN, Hieu
DINH, Cuong V.
PHAM, Hong Quang
NGUYEN, Binh T.
An efficient transformer-based model for Vietnamese punctuation prediction
description In both formal and informal texts, missing punctuation marks make the texts confusing and challenging to read. This paper aims to conduct exhaustive experiments to investigate the benefits of the pre-trained Transformer-based models on two Vietnamese punctuation datasets. The experimental results show that our models achieve encouraging results, and that adding Bi-LSTM and/or CRF layers on top of the proposed models further boosts performance. Finally, our best model significantly outperforms state-of-the-art approaches on both the novel and news datasets for the Vietnamese language, improving overall F1-scores by up to 21.45% and 18.27%, respectively.
format text
author TRAN, Hieu
DINH, Cuong V.
PHAM, Hong Quang
NGUYEN, Binh T.
author_facet TRAN, Hieu
DINH, Cuong V.
PHAM, Hong Quang
NGUYEN, Binh T.
author_sort TRAN, Hieu
title An efficient transformer-based model for Vietnamese punctuation prediction
title_short An efficient transformer-based model for Vietnamese punctuation prediction
title_full An efficient transformer-based model for Vietnamese punctuation prediction
title_fullStr An efficient transformer-based model for Vietnamese punctuation prediction
title_full_unstemmed An efficient transformer-based model for Vietnamese punctuation prediction
title_sort efficient transformer-based model for vietnamese punctuation prediction
publisher Institutional Knowledge at Singapore Management University
publishDate 2021
url https://ink.library.smu.edu.sg/sis_research/7102
https://ink.library.smu.edu.sg/context/sis_research/article/8105/viewcontent/Tran2021_AnEfficientTransformer_Vietnamese_pv.pdf
_version_ 1770576212659273728