Sentiment analysis on movie reviews

With the rapid growth in the digital world, people are active on the internet with their smart devices anytime to share their opinions on any online platform. A large amount of unstructured information is generated every day to be mined and turned into meaningful digital outputs. Natural Langua...

Bibliographic Details
Main Author: Wang, Wen
Other Authors: Sun Aixin
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2021
Subjects: Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
Online Access: https://hdl.handle.net/10356/153196
Institution: Nanyang Technological University
Language: English
id sg-ntu-dr.10356-153196
record_format dspace
spelling sg-ntu-dr.10356-153196 2021-11-16T05:19:12Z Sentiment analysis on movie reviews Wang, Wen Sun Aixin School of Computer Science and Engineering AXSun@ntu.edu.sg Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence Bachelor of Engineering (Computer Science) 2021-11-16T01:27:53Z 2021-11-16T01:27:53Z 2021 Final Year Project (FYP) Wang, W. (2021). Sentiment analysis on movie reviews. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/153196 https://hdl.handle.net/10356/153196 en SCSE20-0952 application/pdf Nanyang Technological University
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
spellingShingle Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
Wang, Wen
Sentiment analysis on movie reviews
description With the rapid growth of the digital world, people use their smart devices to share opinions on online platforms at any time. A large amount of unstructured information is generated every day that can be mined and turned into meaningful output. Natural Language Processing (NLP) aims to extract information from raw text and derive insights by performing different computational tasks. Sentiment analysis, also known as opinion mining, is one such NLP task. It extracts people’s opinions, feelings, and emotions by analysing textual data. It has shown its research value through real-life applications such as collecting customer feedback, performing product analysis, and monitoring a company's brand and reputation. Sentiment analysis has been studied for decades, with many notable solutions using rule-based and machine learning approaches. This project aims to evaluate the performance of supervised machine learning approaches in sentiment analysis. Binary sentiment classification and fine-grained classification are the two subtasks of sentiment analysis. Fine-grained classification is more challenging because it expands polarity into five levels: very positive, positive, neutral, negative, and very negative. With more classes there is a higher probability of a wrong prediction, so models must predict more precisely. Thus, this project focuses on binary sentiment classification, which assigns text to one of two classes: positive or negative. The approach consists of a group of traditional classification algorithms and neural networks. The traditional classification algorithms, including Naïve Bayes, Support Vector Machine, Logistic Regression, and others, are implemented to classify the polarity of text as positive or negative.
In addition, various neural networks, including Convolutional Neural Network and Recurrent Neural Network, are implemented. Lastly, different word vectorization and word embedding methods are used to evaluate their impact on model performance. Comparing the models shows that, among the traditional classifiers, SVM and NB outperform the others, with TF-IDF as the best word vectorization method. Among the neural networks, the RNN models outperform the others. Training time is reduced significantly with GloVe word embeddings, while performance is comparable to that of models with embedding layers from the Keras library.
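As an illustration of the traditional pipeline the description mentions (TF-IDF vectorization feeding a Naïve Bayes classifier), the following is a minimal pure-Python sketch. It is not the project's implementation: the record does not say how the classifiers were built, so the function names, the smoothed IDF variant ln((1 + N) / (1 + df)) + 1, the whitespace tokenizer, and the four toy reviews are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def fit_idf(docs):
    """Smoothed inverse document frequency: ln((1 + N) / (1 + df)) + 1."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(tokenize(doc)))
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}


def tfidf(doc, idf):
    """Map each known term in doc to (term count * IDF); unseen terms are dropped."""
    counts = Counter(t for t in tokenize(doc) if t in idf)
    return {t: c * idf[t] for t, c in counts.items()}


def train_nb(docs, labels, alpha=1.0):
    """Fit multinomial Naive Bayes on TF-IDF weights with additive smoothing."""
    idf = fit_idf(docs)
    weights = defaultdict(Counter)  # label -> term -> summed TF-IDF weight
    for doc, label in zip(docs, labels):
        weights[label].update(tfidf(doc, idf))
    label_counts = Counter(labels)
    log_prior = {c: math.log(k / len(labels)) for c, k in label_counts.items()}
    log_like = {}
    for c in label_counts:
        denom = sum(weights[c].values()) + alpha * len(idf)
        log_like[c] = {t: math.log((weights[c][t] + alpha) / denom) for t in idf}
    return idf, log_prior, log_like


def predict(model, doc):
    """Return the label with the highest posterior log-score."""
    idf, log_prior, log_like = model
    features = tfidf(doc, idf)
    scores = {
        c: log_prior[c] + sum(w * log_like[c][t] for t, w in features.items())
        for c in log_prior
    }
    return max(scores, key=scores.get)


# Hypothetical toy reviews, invented for illustration only.
train_docs = [
    "a wonderful moving film",
    "great acting and a great story",
    "boring plot and terrible acting",
    "a dull terrible waste of time",
]
train_labels = ["positive", "positive", "negative", "negative"]

model = train_nb(train_docs, train_labels)
print(predict(model, "a great film"))
print(predict(model, "terrible boring story"))
```

A real experiment would use a proper tokenizer, a held-out test split, and a much larger review corpus. The same TF-IDF features could equally be passed to a linear SVM or logistic regression, the other traditional classifiers named in the description.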
author2 Sun Aixin
author_facet Sun Aixin
Wang, Wen
format Final Year Project
author Wang, Wen
author_sort Wang, Wen
title Sentiment analysis on movie reviews
publisher Nanyang Technological University
publishDate 2021
url https://hdl.handle.net/10356/153196
_version_ 1718368070264684544