Big data tasks execution time analysis using machine learning techniques

Bibliographic Details
Main Authors: Shabbir, A., Abu Bakar, K., Radzi, R. Z. R. M., Siraj, M.
Format: Conference or Workshop Item
Language: English
Published: 2019
Online Access: http://eprints.utm.my/id/eprint/91088/1/AishaShabbir2019_BigDataTasksExecutionTime.pdf
http://eprints.utm.my/id/eprint/91088/
http://www.ieomsociety.org/ieom2019/papers/665.pdf.
Institution: Universiti Teknologi Malaysia
Description
Summary: Big data and its analysis are a major focus of the current era. The volume of data produced is tremendous, and a significant part of it goes unused because the resources to store and process it efficiently are limited. Hadoop MapReduce is a widely acclaimed platform that can handle very large amounts of data in a cost-effective manner. Using any computational platform effectively requires knowing which components affect its performance; likewise, Hadoop MapReduce's performance can be improved by identifying the factors that affect it. Some researchers have proposed schemes for improving the total completion time of big data tasks on Hadoop MapReduce through suitable selection and scheduling of processing units, i.e., mappers. However, the effect of reducers on total execution time remains underexplored. This paper evaluates the impact of reducers on the total execution time of big data tasks on Hadoop MapReduce by employing machine learning techniques. The evaluation was carried out both analytically and experimentally by varying the number of reducers across tasks of various types and lengths. The results clearly show that the total MapReduce task execution time depends on the number of reducers.