Contrastive learning approach to word-in-context task for low-resource languages
The word-in-context (WiC) task aims to determine whether a target word's occurrences in two sentences share the same sense. In this paper, we propose a Contrastive Learning WiC (CLWiC) framework to improve the learning of sentence/word representations and the classification of target word senses in the sentence pair when performing WiC on low-resource languages. In representation learning, CLWiC trains a pre-trained language model's ability to cope with low-resource languages using both unsupervised and supervised contrastive learning. The WiC classifier learning further fine-tunes the language model with a WiC classification loss under two classifier architecture options, SGBERT and WiSBERT, which use a single encoder and a dual encoder, respectively, to encode a WiC task instance. We evaluate the models developed based on the CLWiC framework on a new WiC dataset constructed for Singlish, a low-resource English creole language used in Singapore, as well as on the standard English WiC benchmark dataset. Our experiments show that CLWiC-based models using both unsupervised and supervised contrastive learning outperform those not using contrastive learning. This performance difference is more substantial for the Singlish dataset than for the English dataset. Unsupervised contrastive learning appears to improve WiC performance more than supervised contrastive learning. Finally, we show that a joint learning strategy achieves the best WiC performance.
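The abstract describes CLWiC's contrastive representation learning only at a high level, so the following is a minimal, hypothetical sketch of the unsupervised contrastive step it mentions, in the SimCSE style: the same sentence is encoded twice under different dropout masks to form a positive pair, and the other sentences in the batch serve as negatives under an InfoNCE loss. The base model name, the mean pooling, and all function names are assumptions rather than details from the paper; the supervised variant would instead draw positive pairs from labeled data.

```python
# Hypothetical sketch of unsupervised contrastive fine-tuning (SimCSE-style).
# Two forward passes over the same sentence (different dropout masks) give a
# positive pair; all other in-batch sentences act as negatives (InfoNCE).
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "bert-base-uncased"  # assumption; the record does not name the base PLM
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
encoder = AutoModel.from_pretrained(MODEL_NAME)
encoder.train()  # keep dropout active so the two passes differ


def embed(sentences):
    """Mean-pooled sentence embeddings (one common pooling choice, assumed here)."""
    batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    hidden = encoder(**batch).last_hidden_state          # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1).float() # (B, T, 1)
    return (hidden * mask).sum(1) / mask.sum(1)          # (B, H)


def unsup_contrastive_loss(sentences, temperature=0.05):
    z1 = embed(sentences)  # first pass
    z2 = embed(sentences)  # second pass: a different dropout mask
    # Pairwise cosine similarities; the positive pair sits on the diagonal.
    sim = F.cosine_similarity(z1.unsqueeze(1), z2.unsqueeze(0), dim=-1) / temperature
    labels = torch.arange(sim.size(0))
    return F.cross_entropy(sim, labels)


loss = unsup_contrastive_loss(["Can lah, no problem one.", "He always like that one."])
loss.backward()
```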
Main Authors: | LO, Pei-Chi; LEE, Yang-Yin; CHEN, Hsien-Hao; KWEE, Agus Trisnajaya; LIM, Ee-peng
---|---
Format: | text (application/pdf)
Language: | English
Published: | Institutional Knowledge at Singapore Management University, 2023-02-01
Subjects: | Databases and Information Systems; Programming Languages and Compilers
Online Access: | https://ink.library.smu.edu.sg/sis_research/8327 https://ink.library.smu.edu.sg/context/sis_research/article/9330/viewcontent/014_Contrastive_poster.pdf
License: | http://creativecommons.org/licenses/by-nc-nd/4.0/ (CC BY-NC-ND 4.0)
Collection: | Research Collection School Of Computing and Information Systems, InK@SMU
Institution: | Singapore Management University
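For illustration, here is a hedged sketch of the two classifier architectures the abstract names: a single-encoder model in the spirit of SGBERT, which reads the sentence pair jointly, and a dual-encoder model in the spirit of WiSBERT, which encodes each sentence separately, SBERT-style. The [CLS] pooling, the (u, v, |u - v|) feature combination, and the class names are assumptions; the record does not specify the heads' internals.

```python
# Hypothetical sketch of the two WiC classifier options named in the abstract:
# single-encoder (joint pair encoding) vs. dual-encoder (separate encodings).
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "bert-base-uncased"  # assumption
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)


class SingleEncoderWiC(nn.Module):
    """Both sentences in one input; classify from the pooled pair encoding."""

    def __init__(self):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(MODEL_NAME)
        self.head = nn.Linear(self.encoder.config.hidden_size, 2)  # same/different sense

    def forward(self, sent1, sent2):
        batch = tokenizer(sent1, sent2, padding=True, truncation=True, return_tensors="pt")
        cls = self.encoder(**batch).last_hidden_state[:, 0]  # [CLS] token
        return self.head(cls)


class DualEncoderWiC(nn.Module):
    """Encode each sentence separately, then classify from (u, v, |u - v|)."""

    def __init__(self):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(MODEL_NAME)
        self.head = nn.Linear(3 * self.encoder.config.hidden_size, 2)

    def _encode(self, sentences):
        batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
        return self.encoder(**batch).last_hidden_state[:, 0]

    def forward(self, sent1, sent2):
        u, v = self._encode(sent1), self._encode(sent2)
        return self.head(torch.cat([u, v, torch.abs(u - v)], dim=-1))


# Usage: logits over {same sense, different sense} for one sentence pair.
logits = SingleEncoderWiC()(["He sat on the river bank."], ["She works at the bank."])
```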