A SimCSE-based model for sentiment analysis in Chinese text messages

Existing sentiment analysis algorithms mainly focus on vectorized textual data representation and constructing high-quality deep learning classifiers. However, improving sentence embedding methods could enhance textual sentiment classification models. This project introduces a model for text-leve...

Bibliographic Details
Main Author: Song, Haiyang
Other Authors: -
Format: Thesis-Master by Coursework
Language: English
Published: Nanyang Technological University 2024
Subjects: Computer and Information Science; Engineering
Online Access: https://hdl.handle.net/10356/177139
Institution: Nanyang Technological University
id sg-ntu-dr.10356-177139
record_format dspace
spelling sg-ntu-dr.10356-177139 2024-05-24T15:56:11Z A SimCSE-based model for sentiment analysis in Chinese text messages Song, Haiyang - School of Electrical and Electronic Engineering Chen Lihui elhchen@ntu.edu.sg Computer and Information Science Engineering Master's degree 2024-05-23T02:46:35Z 2024-05-23T02:46:35Z 2024 Thesis-Master by Coursework Song, H. (2024). A SimCSE-based model for sentiment analysis in Chinese text messages. Master's thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/177139 en application/pdf Nanyang Technological University
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Computer and Information Science
Engineering
description Existing sentiment analysis algorithms mainly focus on vectorized textual data representation and on constructing high-quality deep learning classifiers. However, improving sentence embedding methods could also enhance textual sentiment classification models. This project introduces a model for text-level sentiment classification that uses contrastive learning and BERT pre-trained language models. The model combines SimCSE with self-supervised BERT training using contrastive learning. It converts a simple text-level sentiment analysis dataset into sentence pairs through back translation and constructs a siamese network of BERT encoders, in which both sides share the same structure and parameters. Feeding the back-translated text pairs into the BERT encoders yields sentence representation vectors. The model is optimized by summing the loss functions and back-propagating the result. Finally, one side of the trained siamese BERT network is applied to a supervised classification module for Chinese text sentiment classification. Experimental validation on three Chinese datasets, including waimai_10k, ChnSentiCorp_htl_all, and online_shopping_10_cats, demonstrates the effectiveness of the model and its superiority over several cutting-edge text-level sentiment classification models. Keywords: Natural Language Processing; Sentiment Analysis; Contrastive Learning; Siamese Network.
author2 -
author_facet -
Song, Haiyang
format Thesis-Master by Coursework
author Song, Haiyang
author_sort Song, Haiyang
title A SimCSE-based model for sentiment analysis in Chinese text messages
title_sort simcse-based model for sentiment analysis in chinese text messages
publisher Nanyang Technological University
publishDate 2024
url https://hdl.handle.net/10356/177139
_version_ 1800916214064611328
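
The contrastive step summarized in the description field above can be illustrated with a small sketch. This is not the thesis code: the record does not give the exact loss, so the in-batch InfoNCE form, the temperature of 0.05, the two-direction sum, and the function names `cosine`, `info_nce_loss`, and `symmetric_loss` are all assumptions made for illustration. Plain Python lists stand in for the sentence vectors that the shared BERT encoder would produce for each original sentence and its back-translated counterpart.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(anchors, positives, temperature=0.05):
    """Mean in-batch contrastive loss.

    anchors[i] and positives[i] are embeddings of a sentence and its
    back-translated paraphrase; every other row of `positives` acts as
    a negative for anchors[i]. The temperature is an assumed value.
    """
    if not anchors or len(anchors) != len(positives):
        raise ValueError("need two non-empty batches of equal size")
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Numerically stable log-sum-exp over the whole batch.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        # Cross-entropy with the matching pair as the correct class.
        total += log_denominator - logits[i]
    return total / len(anchors)


def symmetric_loss(left, right, temperature=0.05):
    """One possible reading of 'summing loss functions': add both directions."""
    return info_nce_loss(left, right, temperature) + info_nce_loss(right, left, temperature)


# Toy 2-D "embeddings": the aligned batch pairs each sentence with a
# close paraphrase, the swapped batch pairs it with the wrong one.
originals = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]
swapped = [[0.1, 0.9], [0.9, 0.1]]

aligned_loss = info_nce_loss(originals, aligned)
swapped_loss = info_nce_loss(originals, swapped)
print(aligned_loss < swapped_loss)
```

In a real training loop the loss would be computed on encoder outputs with an automatic differentiation framework and back-propagated through the shared encoder weights; the sketch only shows that correctly paired sentences score a much lower loss than mismatched ones.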