Sentiment analysis of context word embeddings

Bibliographic Details
Main Author: Khan, Mohammad Sadique
Other Authors: Ponnuthurai Nagaratnam Suganthan
Format: Thesis-Master by Coursework
Language: English
Published: Nanyang Technological University 2021
Online Access: https://hdl.handle.net/10356/147554
Institution: Nanyang Technological University
Description
Summary: With every technological advance, the role of machines in our lives grows, and now more than ever there is a need to communicate with machines naturally. Natural communication involves more than the recognition of predefined commands; it includes, but is not limited to, understanding the contextual and sentiment information in a conversation. Natural language processing (NLP) is the branch of computer science and AI that deals with interaction between computers and humans. NLP encompasses many aspects of human-machine communication, such as speech recognition, syntactic analysis, and lexical semantics. This study is limited to the part of NLP concerned with sentiment extraction and the classification of text by sentiment, into either binary or ternary classes. The IMDB, SST, and SemEval datasets provide pre-labelled sentences curated for sentiment analysis tasks; they are used to train and test the deep learning classification models developed in this thesis. The BERT framework is used as the embedding algorithm to extract word embeddings. CNN and fully connected deep neural network architectures are used to build models that classify text into binary and ternary sentiment labels. The models are compared on several metrics: Macro-F1, accuracy, precision, and recall. The BERT embedding with a CNN classifier is found to perform better on all datasets than the other classifier models discussed in this thesis.
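
The evaluation metrics named in the summary can be illustrated with a minimal sketch. It assumes the conventional definitions of accuracy and of macro-averaged precision, recall, and F1 (the unweighted mean of the per-class scores); the thesis's own evaluation code is not shown in this record, and the function name macro_scores and the sample labels below are hypothetical.

```python
from collections import Counter


def macro_scores(y_true, y_pred, labels):
    """Return accuracy and macro-averaged precision, recall and F1.

    Assumption: macro averaging is the unweighted mean of per-class
    scores, with a score of 0.0 when its denominator is zero.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        p = tp[label] / p_den if p_den else 0.0
        r = tp[label] / r_den if r_den else 0.0
        f1 = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "macro_f1": sum(f1s) / n,
    }


if __name__ == "__main__":
    # Hypothetical ternary example: positive, negative, neutral.
    truth = ["pos", "neg", "neu", "pos", "neg", "neu"]
    preds = ["pos", "neg", "pos", "pos", "neu", "neu"]
    print(macro_scores(truth, preds, ["pos", "neg", "neu"]))
```

Macro averaging gives each sentiment class equal weight regardless of how many examples it has, which is one reason it is often reported alongside accuracy for ternary sentiment data with uneven class sizes.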