Autonomous learning machine on online big data analytics
The term “Big Data” refers to datasets that existing database management tools cannot retrieve, store, handle and analyze. Although big data is most often associated with volume, researchers in the field have found that it is also characterized by other Vs: Variety,...
Main Author: | |
---|---|
Other Authors: | |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2019 |
Subjects: | |
Online Access: | http://hdl.handle.net/10356/76918 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-76918 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-76918 2023-03-03T20:30:53Z Autonomous learning machine on online big data analytics Lim, Yan Jun Mahardhika Pratama School of Computer Science and Engineering Centre for Computational Intelligence DRNTU::Engineering::Computer science and engineering The term “Big Data” refers to datasets that existing database management tools cannot retrieve, store, handle and analyze. Although big data is most often associated with volume, researchers in the field have found that it is also characterized by other Vs, such as Variety, Velocity and Veracity. Various data analytics tools have been proposed. One widely used approach is MapReduce from Google. Nevertheless, most existing works are offline in nature: they expect full access to the complete dataset and allow a machine learning algorithm to make multiple passes over all the data. In this project, an online parallelization technique is developed that integrates an Autonomous Learning Machine (ALMA). In addition, a data fusion technique is developed to merge the outputs of ALMA from the different parallelized data partitions. Both techniques are implemented in R, using the RStudio environment and Apache Spark. Bachelor of Engineering (Computer Science) 2019-04-23T14:04:58Z 2019-04-23T14:04:58Z 2019 Final Year Project (FYP) http://hdl.handle.net/10356/76918 en Nanyang Technological University 51 p. application/pdf Nanyang Technological University |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
DRNTU::Engineering::Computer science and engineering |
description |
The term “Big Data” refers to datasets that existing database management tools
cannot retrieve, store, handle and analyze.
Although big data is most often associated with volume, researchers in the field
have found that it is also characterized by other Vs, such as Variety, Velocity and Veracity.
Various data analytics tools have been proposed. One widely used
approach is MapReduce from Google.
Nevertheless, most existing works are offline in nature: they expect full access to the
complete dataset and allow a machine learning algorithm to make multiple passes over
all the data.
In this project, an online parallelization technique is developed that integrates an
Autonomous Learning Machine (ALMA). In addition, a data fusion technique is
developed to merge the outputs of ALMA from the different parallelized data
partitions. Both techniques are implemented in R, using the RStudio
environment and Apache Spark. |
author2 |
Mahardhika Pratama |
author_facet |
Mahardhika Pratama Lim, Yan Jun |
format |
Final Year Project |
author |
Lim, Yan Jun |
author_sort |
Lim, Yan Jun |
title |
Autonomous learning machine on online big data analytics |
publisher |
Nanyang Technological University |
publishDate |
2019 |
url |
http://hdl.handle.net/10356/76918 |
_version_ |
1759858379906351104 |