Machine learning on Mars GPU map-reduce framework

Bibliographic Details
Main Author: Xi, Yewen
Other Authors: School of Computer Engineering
Format: Final Year Project
Language: English
Published: 2014
Subjects:
Online Access: http://hdl.handle.net/10356/55770
Institution: Nanyang Technological University
Description
Summary: We are in the era of multi-core processors and big data. Although a large body of research addresses parallel computing and machine learning, little of it focuses on combining the two. This report investigates implementing machine learning algorithms on the Mars GPU Map-Reduce framework to improve computational performance and big data analytics. Three machine learning algorithms were implemented: a neural network, principal component analysis and independent component analysis. As data size increases, the Map-Reduce GPU program runs faster than a sequential program on the CPU, because the GPU's many cores process data in parallel, which is much more efficient than sequential CPU processing. In addition, two Map-Reduce GPU frameworks, Mars and MapCG, were compared. On a benchmark of a few applications, MapCG was more efficient than Mars. The main reason is that MapCG uses a hash table to group intermediate key/value pairs, whereas Mars sorts them. In conclusion, these results suggest that Map-Reduce GPU frameworks could be used for better analytics on big data. Further studies could compare more machine learning algorithms or other applications to identify other ways of further improving computing performance.
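The difference between sort-based and hash-based grouping of intermediate key/value pairs, as described in the summary, can be illustrated with a minimal sequential sketch. This is plain CPU Python, not the Mars or MapCG API; the word-count workload and all function names (`map_words`, `group_by_sort`, `group_by_hash`, `reduce_counts`) are hypothetical and chosen only for illustration. Sorting needs on the order of n log n comparisons, while hash-table insertion takes expected linear time, which is one plausible reading of why the hash-based approach benchmarks faster.

```python
from collections import defaultdict
from itertools import groupby
from operator import itemgetter


def map_words(line):
    """Map step: emit a (word, 1) pair for every word in a line."""
    return [(word, 1) for word in line.split()]


def group_by_sort(pairs):
    """Sort-based grouping: sort all pairs by key, then collect runs of equal keys."""
    ordered = sorted(pairs, key=itemgetter(0))
    return {
        key: [value for _, value in run]
        for key, run in groupby(ordered, key=itemgetter(0))
    }


def group_by_hash(pairs):
    """Hash-based grouping: a single pass that appends each value to its key's bucket."""
    buckets = defaultdict(list)
    for key, value in pairs:
        buckets[key].append(value)
    return dict(buckets)


def reduce_counts(groups):
    """Reduce step: sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


# Illustrative input only; not data from the report.
lines = ["gpu map reduce", "map reduce on gpu", "gpu"]
pairs = [pair for line in lines for pair in map_words(line)]

counts_sorted = reduce_counts(group_by_sort(pairs))
counts_hashed = reduce_counts(group_by_hash(pairs))
```

Both grouping strategies feed the same reduce step and produce the same counts; they differ only in how the intermediate pairs are brought together by key.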