Autonomous learning machine on online big data analytics

The term “Big Data” refers to datasets of a scale that existing database management tools cannot retrieve, store, handle and analyze. Although big data is often associated with volume, researchers in the field have found that it is also characterized by other Vs: Variety,...

Bibliographic Details
Main Author: Lim, Yan Jun
Other Authors: Mahardhika Pratama
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2019
Subjects: DRNTU::Engineering::Computer science and engineering
Online Access: http://hdl.handle.net/10356/76918
Institution: Nanyang Technological University
id sg-ntu-dr.10356-76918
record_format dspace
spelling sg-ntu-dr.10356-76918 2023-03-03T20:30:53Z Autonomous learning machine on online big data analytics Lim, Yan Jun Mahardhika Pratama School of Computer Science and Engineering Centre for Computational Intelligence DRNTU::Engineering::Computer science and engineering The term “Big Data” refers to datasets of a scale that existing database management tools cannot retrieve, store, handle and analyze. Although big data is often associated with volume, researchers in the field have found that it is also characterized by other Vs: Variety, Velocity, Veracity, etc. Various data analytics tools have been proposed. One widely used approach is MapReduce from Google. Nevertheless, most existing works are offline in nature, because they expect full access to the complete dataset and allow a machine learning algorithm to make multiple passes over all the data. In this project, an online parallelization technique is developed that integrates an Autonomous Learning Machine (ALMA). In addition, a data fusion technique is developed to merge the outputs of ALMA from the different parallelized data partitions. Both techniques are developed in R, using the RStudio environment and Apache Spark. Bachelor of Engineering (Computer Science) 2019-04-23T14:04:58Z 2019-04-23T14:04:58Z 2019 Final Year Project (FYP) http://hdl.handle.net/10356/76918 en Nanyang Technological University 51 p. application/pdf Nanyang Technological University
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering::Computer science and engineering
spellingShingle DRNTU::Engineering::Computer science and engineering
Lim, Yan Jun
Autonomous learning machine on online big data analytics
description The term “Big Data” refers to datasets of a scale that existing database management tools cannot retrieve, store, handle and analyze. Although big data is often associated with volume, researchers in the field have found that it is also characterized by other Vs: Variety, Velocity, Veracity, etc. Various data analytics tools have been proposed. One widely used approach is MapReduce from Google. Nevertheless, most existing works are offline in nature, because they expect full access to the complete dataset and allow a machine learning algorithm to make multiple passes over all the data. In this project, an online parallelization technique is developed that integrates an Autonomous Learning Machine (ALMA). In addition, a data fusion technique is developed to merge the outputs of ALMA from the different parallelized data partitions. Both techniques are developed in R, using the RStudio environment and Apache Spark.
author2 Mahardhika Pratama
author_facet Mahardhika Pratama
Lim, Yan Jun
format Final Year Project
author Lim, Yan Jun
author_sort Lim, Yan Jun
title Autonomous learning machine on online big data analytics
title_short Autonomous learning machine on online big data analytics
title_full Autonomous learning machine on online big data analytics
title_fullStr Autonomous learning machine on online big data analytics
title_full_unstemmed Autonomous learning machine on online big data analytics
title_sort autonomous learning machine on online big data analytics
publisher Nanyang Technological University
publishDate 2019
url http://hdl.handle.net/10356/76918
_version_ 1759858379906351104