An efficient transformer-based model for Vietnamese punctuation prediction

Bibliographic Details
Main Authors:	TRAN, Hieu; DINH, Cuong V.; PHAM, Hong Quang; NGUYEN, Binh T.
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2021
Subjects:	Punctuation prediction; Transfer learning; Transformer models; Numerical Analysis and Computation; South and Southeast Asian Languages and Societies; Theory and Algorithms
Online Access:	https://ink.library.smu.edu.sg/sis_research/7102 https://ink.library.smu.edu.sg/context/sis_research/article/8105/viewcontent/Tran2021_AnEfficientTransformer_Vietnamese_pv.pdf
Institution:	Singapore Management University

Description
Summary:	In both formal and informal texts, missing punctuation marks make the text confusing and hard to read. This paper conducts extensive experiments to investigate the benefits of pre-trained Transformer-based models on two Vietnamese punctuation datasets. The experimental results show that the models achieve encouraging results, and that adding Bi-LSTM and/or CRF layers on top of the proposed models can further boost performance. Finally, the best model significantly outperforms state-of-the-art approaches on both the novel and news datasets for the Vietnamese language, with corresponding gains of up to 21.45% and 18.27% in overall F1-score.
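
The summary does not spell out how the task or the score is set up, so the following is only a minimal sketch under stated assumptions: punctuation prediction is treated as per-word sequence labeling (a common framing), the label set `PUNCT_LABELS` is hypothetical, and the overall F1 is taken to be micro-averaged over punctuation labels. None of these details are confirmed by this record.

```python
# Hypothetical label set; the record does not list the punctuation classes used.
PUNCT_LABELS = {",": "COMMA", ".": "PERIOD", "?": "QUESTION"}


def text_to_labels(text):
    """Turn punctuated text into (word, label) pairs; 'O' means no mark follows."""
    pairs = []
    for token in text.split():
        label = "O"
        # A trailing punctuation mark becomes the label of the preceding word.
        if len(token) > 1 and token[-1] in PUNCT_LABELS:
            label = PUNCT_LABELS[token[-1]]
            token = token[:-1]
        pairs.append((token.lower(), label))
    return pairs


def micro_f1(gold, pred):
    """Micro-averaged F1 over punctuation labels, ignoring the 'O' class.

    Assumption: this is one common way to compute an overall score; the
    record does not say which averaging the paper uses.
    """
    tp = sum(1 for g, p in zip(gold, pred) if g == p and g != "O")
    pred_pos = sum(1 for p in pred if p != "O")
    gold_pos = sum(1 for g in gold if g != "O")
    if tp == 0:
        return 0.0
    precision = tp / pred_pos
    recall = tp / gold_pos
    return 2 * precision * recall / (precision + recall)


pairs = text_to_labels("Xin chào, bạn khỏe không?")
gold = [label for _, label in pairs]
pred = ["O", "COMMA", "O", "COMMA", "O"]  # made-up model output for illustration
score = micro_f1(gold, pred)
```

In this framing, a model (for example a pre-trained Transformer encoder, optionally followed by Bi-LSTM or CRF layers as the summary mentions) would receive the unpunctuated words and predict one label per word.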
