A SimCSE-based model for sentiment analysis in Chinese text messages
Existing sentiment analysis algorithms mainly focus on vectorized textual data representation and constructing high-quality deep learning classifiers. However, improving sentence embedding methods could enhance textual sentiment classification models. This project introduces a model for text-leve...
Main Author: | Song, Haiyang |
---|---|
Other Authors: | |
Format: | Thesis-Master by Coursework |
Language: | English |
Published: | Nanyang Technological University, 2024 |
Subjects: | Computer and Information Science Engineering |
Online Access: | https://hdl.handle.net/10356/177139 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-177139 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-177139
2024-05-24T15:56:11Z
A SimCSE-based model for sentiment analysis in Chinese text messages
Song, Haiyang
-
School of Electrical and Electronic Engineering
Chen Lihui
elhchen@ntu.edu.sg
Computer and Information Science Engineering
Existing sentiment analysis algorithms mainly focus on vectorized representations of text and on constructing high-quality deep learning classifiers. However, improving sentence embedding methods could also enhance text sentiment classification models. This project introduces a model for text-level sentiment classification that uses contrastive learning and BERT pre-trained language models. The model combines SimCSE with self-supervised BERT training using contrastive learning. It converts a simple text-level sentiment analysis dataset into sentence pairs through back translation and constructs a siamese network of two BERT encoders that share the same structure and parameters. Feeding the back-translated sentiment analysis text pairs into the BERT encoders yields sentence representation vectors. The model is trained by summing the loss functions and back-propagating the total. Finally, one side of the trained siamese BERT network is applied to a supervised classification module for Chinese text sentiment classification. Experiments on three Chinese datasets, waimai_10k, ChnSentiCorp_htl_all, and online_shopping_10_cats, indicate that the model outperforms several recent text-level sentiment classification models.
Keywords: Natural Language Processing; Sentiment Analysis; Contrastive Learning; Siamese Network.
Master's degree
2024-05-23T02:46:35Z
2024-05-23T02:46:35Z
2024
Thesis-Master by Coursework
Song, H. (2024). A SimCSE-based model for sentiment analysis in Chinese text messages. Master's thesis, Nanyang Technological University, Singapore.
https://hdl.handle.net/10356/177139
en
application/pdf
Nanyang Technological University |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Computer and Information Science Engineering |
spellingShingle |
Computer and Information Science Engineering
Song, Haiyang
A SimCSE-based model for sentiment analysis in Chinese text messages |
description |
Existing sentiment analysis algorithms mainly focus on vectorized representations
of text and on constructing high-quality deep learning classifiers. However,
improving sentence embedding methods could also enhance text sentiment
classification models. This project introduces a model for text-level sentiment
classification that uses contrastive learning and BERT pre-trained language models.
The model combines SimCSE with self-supervised BERT training using contrastive
learning. It converts a simple text-level sentiment analysis dataset into sentence
pairs through back translation and constructs a siamese network of two BERT
encoders that share the same structure and parameters. Feeding the back-translated
sentiment analysis text pairs into the BERT encoders yields sentence
representation vectors. The model is trained by summing the loss functions and
back-propagating the total. Finally, one side of the trained siamese BERT network
is applied to a supervised classification module for Chinese text sentiment
classification.
Experiments on three Chinese datasets, waimai_10k, ChnSentiCorp_htl_all, and
online_shopping_10_cats, indicate that the model outperforms several recent
text-level sentiment classification models.
Keywords: Natural Language Processing; Sentiment Analysis; Contrastive Learning;
Siamese Network. |
author2 |
- |
author_facet |
- Song, Haiyang |
format |
Thesis-Master by Coursework |
author |
Song, Haiyang |
author_sort |
Song, Haiyang |
title |
A SimCSE-based model for sentiment analysis in Chinese text messages |
publisher |
Nanyang Technological University |
publishDate |
2024 |
url |
https://hdl.handle.net/10356/177139 |
_version_ |
1800916214064611328 |
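
The description above outlines contrastive training on back-translated sentence pairs but does not state the loss itself. As a rough illustration only, the sketch below assumes the standard SimCSE objective (an InfoNCE loss over cosine similarities with in-batch negatives and temperature 0.05). The function names and the toy two-dimensional vectors are hypothetical stand-ins for the sentence vectors a shared BERT encoder would produce, and the sketch covers a single contrastive term rather than the summed loss the thesis mentions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(originals, back_translated, temperature=0.05):
    """Mean InfoNCE loss over a batch of sentence-vector pairs.

    originals[i] and back_translated[i] are treated as a positive pair;
    every other back-translated vector in the batch acts as a negative.
    (Assumed SimCSE-style objective, not confirmed by the source.)
    """
    assert len(originals) == len(back_translated)
    losses = []
    for i, anchor in enumerate(originals):
        logits = [cosine(anchor, other) / temperature
                  for other in back_translated]
        # Numerically stable log-sum-exp over all candidates in the batch.
        peak = max(logits)
        log_denominator = peak + math.log(
            sum(math.exp(x - peak) for x in logits))
        # Negative log-probability of picking the true partner.
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Toy vectors standing in for encoder outputs (hypothetical values).
originals = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]   # each pair points the same way
swapped = [[0.1, 0.9], [0.9, 0.1]]   # partners deliberately mismatched

loss_aligned = contrastive_loss(originals, aligned)
loss_swapped = contrastive_loss(originals, swapped)
print(loss_aligned < loss_swapped)
```

The loss is small when each sentence vector sits closest to its own back-translated partner and grows when another sentence in the batch is closer, which is the pressure that would pull paraphrase pairs together during training.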