Development of Java-versioned extreme learning machine and its parallelism using MapReduce

Bibliographic Details
Main Author: Deng, Yuchen
Other Authors: Huang Guangbin
Format: Final Year Project
Language: English
Published: 2012
Online Access: http://hdl.handle.net/10356/49672
Institution: Nanyang Technological University
Description
Summary: Parallel computing is regarded as the trend in today’s data processing. Through parallelism, people are seeking powerful tools that can handle larger volumes of data at higher speed and with higher precision. This project explores ways to enhance the performance of the Extreme Learning Machine by combining it with parallel computing. MapReduce, a programming model for cloud computing originally proposed by Google Inc., is chosen for deployment. Because of intellectual property issues, its open-source implementation, Apache Hadoop MapReduce, is used in this project. Since Hadoop MapReduce is written in Java, the conventional Extreme Learning Machine is first developed in Java, and part of the computation is then parallelized using MapReduce. The performance of the Java version of the Extreme Learning Machine is tested and benchmarked against existing experimental data from its MATLAB version. A pseudo-distributed Hadoop MapReduce framework is set up to replace the matrix multiplication portion of the Extreme Learning Machine. Unfortunately, because of a compatibility issue, this part of the code could not be executed successfully, leaving its performance untested. The development and installation processes are explained thoroughly, with source code attached in the appendix.
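
The summary does not show how the matrix multiplication was split into map and reduce steps, so the following is only an illustrative sketch of one common decomposition, not the project's code. It runs in plain Java without Hadoop: the map step emits one partial product per output cell for each shared index k, and the reduce step sums the partial products that share a cell. The class name `MapReduceMatMulSketch` and all method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, in-memory imitation of a map/reduce matrix product (no Hadoop involved).
class MapReduceMatMulSketch {

    /** Intermediate record: the output cell (row, col) and one partial product for it. */
    static final class Pair {
        final int row;
        final int col;
        final double value;

        Pair(int row, int col, double value) {
            this.row = row;
            this.col = col;
            this.value = value;
        }
    }

    /** Map step for one shared index k: emit a[i][k] * b[k][j] for every output cell (i, j). */
    static List<Pair> map(double[][] a, double[][] b, int k) {
        List<Pair> out = new ArrayList<>();
        for (int i = 0; i < a.length; i++) {
            for (int j = 0; j < b[k].length; j++) {
                out.add(new Pair(i, j, a[i][k] * b[k][j]));
            }
        }
        return out;
    }

    /** Shuffle and reduce step: group partial products by output cell and sum each group. */
    static double[][] reduce(List<Pair> pairs, int rows, int cols) {
        Map<Long, Double> sums = new HashMap<>();
        for (Pair p : pairs) {
            long key = (long) p.row * cols + p.col; // flatten (row, col) into a single key
            sums.merge(key, p.value, Double::sum);
        }
        double[][] c = new double[rows][cols];
        for (Map.Entry<Long, Double> e : sums.entrySet()) {
            int row = (int) (e.getKey() / cols);
            int col = (int) (e.getKey() % cols);
            c[row][col] = e.getValue();
        }
        return c;
    }

    /** Runs every map task sequentially, then the reduce step, and returns a * b. */
    static double[][] multiply(double[][] a, double[][] b) {
        if (a.length == 0 || b.length == 0 || a[0].length != b.length) {
            throw new IllegalArgumentException("inner dimensions must match and be non-zero");
        }
        List<Pair> intermediate = new ArrayList<>();
        for (int k = 0; k < b.length; k++) {
            intermediate.addAll(map(a, b, k));
        }
        return reduce(intermediate, a.length, b[0].length);
    }

    public static void main(String[] args) {
        double[][] a = {{1, 2}, {3, 4}};
        double[][] b = {{5, 6}, {7, 8}};
        System.out.println(Arrays.deepToString(multiply(a, b)));
        // prints [[19.0, 22.0], [43.0, 50.0]]
    }
}
```

In an actual Hadoop job the map tasks would run on separate workers and the grouping by key would be done by the framework's shuffle phase; here both are simulated in one process to keep the example self-contained.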