Classification on big data set using data analytics techniques

The advancement of big data allows data analytics to grow with the increase in the amount of information that can be processed. As information is more readily available, programs can be created to extract, analyse and classify online social media messages and comments. Existing word dictionaries are...

Bibliographic Details
Main Author: Chung, Ka Wai
Other Authors: Chan Chee Keong
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2020
Subjects: Engineering::Electrical and electronic engineering::Computer hardware, software and systems
Online Access: https://hdl.handle.net/10356/138594
Institution: Nanyang Technological University
id sg-ntu-dr.10356-138594
record_format dspace
spelling sg-ntu-dr.10356-138594 2023-07-07T18:09:55Z
Classification on big data set using data analytics techniques
Chung, Ka Wai
Chan Chee Keong
School of Electrical and Electronic Engineering
eckchan@ntu.edu.sg
Engineering::Electrical and electronic engineering::Computer hardware, software and systems
Bachelor of Engineering (Electrical and Electronic Engineering)
2020-05-11T02:16:38Z 2020-05-11T02:16:38Z 2020
Final Year Project (FYP)
https://hdl.handle.net/10356/138594
en
A3039-191
application/pdf
Nanyang Technological University
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Engineering::Electrical and electronic engineering::Computer hardware, software and systems
description The advancement of big data allows data analytics to grow with the increase in the amount of information that can be processed. As information becomes more readily available, programs can be created to extract, analyse and classify online social media messages and comments. Existing word dictionaries are based on older literary texts and documents and are unable to pick up slang used by internet users, or languages that are an amalgamation of different dialects and languages, such as Singlish. The project aims to create a classification model, based on a localised dataset from an online message board, that categorises comments as positive, negative or neutral in sentiment. Three classification concepts were explored and five different models were generated, with accuracies ranging from 57% to 64%. A voting classifier combining all five models achieved a higher accuracy of 65.5%. A chatbot was also programmed to interact with the classification models and evaluate the sentiment of user input. This project can be utilised in social data analytics and metrics to gauge feedback from online comments on news and updates.
author2 Chan Chee Keong
format Final Year Project
author Chung, Ka Wai
title Classification on big data set using data analytics techniques
publisher Nanyang Technological University
publishDate 2020
url https://hdl.handle.net/10356/138594
_version_ 1772829160139390976
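
The description mentions a voting classifier that combines five sentiment models. The record does not say how the votes were combined, so the following is only a minimal sketch of one common approach, hard (majority) voting over per-model labels. The function name `majority_vote`, the example votes, and the rule of falling back to "neutral" on a tie are all assumptions made for illustration, not details taken from the project.

```python
from collections import Counter


def majority_vote(predictions, tie_break="neutral"):
    """Return the label chosen by the most models.

    `predictions` holds one label per model for a single comment.
    If the two most frequent labels are tied, `tie_break` is returned
    (an assumed rule; the source does not describe tie handling).
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    # most_common() sorts labels by descending vote count
    counts = Counter(predictions).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return tie_break
    return counts[0][0]


# Hypothetical outputs of five models for one comment
votes = ["positive", "negative", "positive", "neutral", "positive"]
print(majority_vote(votes))
```

An ensemble of this kind can score higher than any single member when the models make different mistakes, which is consistent with the 65.5% reported for the combined classifier against 57% to 64% for the individual models.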