Sentiment analysis of context word embeddings
Main Author: | |
---|---|
Other Authors: | |
Format: | Thesis-Master by Coursework |
Language: | English |
Published: | Nanyang Technological University, 2021 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/147554 |
Institution: | Nanyang Technological University |
Summary: | With every technological advancement, the role of machines in our lives grows, and now, more than ever, there is a need to communicate with machines naturally. Natural communication involves more than the recognition of pre-defined commands by machines; it includes, but is not limited to, understanding the contextual and sentiment information of a conversation. Natural language processing (NLP) is the branch of computer science and AI that deals with the interaction between computers and humans. NLP encompasses many aspects of communication between machine and human, such as speech recognition, syntactic analysis, and lexical semantics. This study is limited to the part of NLP concerned with sentiment extraction and the classification of text based on its sentiment.
Text is classified into binary or ternary classes based on its sentiment. The IMDB, SST, and SemEval datasets provide pre-labelled sentences curated for sentiment analysis tasks. These datasets are used to train and test the deep learning classification models developed in this thesis. An embedding algorithm such as the BERT framework is used to extract word embeddings. CNN and fully connected deep neural network architectures are used to build models that classify text into binary and ternary sentiment labels. The classification models are compared against each other on metrics such as Macro-F1, accuracy, precision, and recall. The BERT embedding with a CNN classifier is found to outperform all other classifier models discussed in this thesis on all datasets. |
---|