Asynchronous Non-Blocking Algorithm to Handle Straggler Reduce Tasks in Hadoop System

Hadoop is widely adopted for big data processing because it runs on commodity hardware in reasonable time. Hadoop implements asynchronous blocking concurrency with the Thread and Future classes. Consequently, in cases such as a network link or hardware failure, a running task may block other task...

Bibliographic Details
Main Authors: Khoiruddin, A.A., Zakaria, N., Alhussian, H.
Format: Article
Published: Insight Society 2020
Online Access: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85097534444&doi=10.18517%2fijaseit.10.5.9073&partnerID=40&md5=06bb7f5641e7773e25a91dc64089e2e0
http://eprints.utp.edu.my/23113/
Institution: Universiti Teknologi Petronas
id my.utp.eprints.23113
record_format eprints
spelling my.utp.eprints.23113 2021-08-19T05:26:36Z Khoiruddin, A.A. and Zakaria, N. and Alhussian, H. (2020) Asynchronous Non-Blocking Algorithm to Handle Straggler Reduce Tasks in Hadoop System. International Journal on Advanced Science, Engineering and Information Technology, 10 (5). pp. 1913-1919. © Insight Society 2020. Article. NonPeerReviewed. http://eprints.utp.edu.my/23113/
institution Universiti Teknologi Petronas
building UTP Resource Centre
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Petronas
content_source UTP Institutional Repository
url_provider http://eprints.utp.edu.my/
description Hadoop is widely adopted for big data processing because it runs on commodity hardware in reasonable time. Hadoop implements asynchronous blocking concurrency with the Thread and Future classes. Consequently, in cases such as a network link or hardware failure, a running task may block other tasks from running (the task becomes a straggler). Hadoop releases ship with algorithms to handle the straggler problem. However, these algorithms manage Map and Reduce tasks in the same way, although the root cause of straggling may differ between the two. This paper proposes the Asynchronous Non-Blocking (ANB) method to improve performance and avoid blocking of Reduce tasks in Hadoop. Instead of a single queue, the approach uses two queues: a task queue and a callback queue. When a task is not ready or is detected as a straggler, it is removed from the main task queue and temporarily placed in the callback queue. When the task is ready to run, it is returned to the main task queue. The algorithm is compared with rTuner, the most recent work the authors found on handling stragglers among Reduce tasks. The comparison shows that ANB consistently completes faster, because unready tasks are put directly into the callback queue without blocking other tasks. Furthermore, rTuner incurs high overhead because it must check the straggler status and determine why a task became a straggler.
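
The two-queue idea in the description can be illustrated with a small, self-contained sketch. This is not the authors' implementation and does not use any Hadoop API: the `Task` class, its `is_ready` probe, the `run_two_queue` function, and the `max_rounds` cap are all hypothetical names introduced here for illustration. The sketch only shows the scheduling pattern, in which an unready task is parked in a callback queue instead of being waited on, and returns to the main queue once it reports ready.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, List


@dataclass
class Task:
    """Hypothetical task record; not a Hadoop type."""
    name: str
    is_ready: Callable[[], bool]  # assumed readiness probe
    run: Callable[[], None]       # assumed task body


def run_two_queue(tasks: List[Task], max_rounds: int = 1000) -> List[str]:
    """Run tasks using a main task queue plus a callback queue.

    Returns task names in the order they actually ran.
    max_rounds is an illustrative guard against tasks that never become ready.
    """
    task_queue: Deque[Task] = deque(tasks)
    callback_queue: Deque[Task] = deque()
    completed: List[str] = []

    for _ in range(max_rounds):
        if not task_queue and not callback_queue:
            break
        # Return parked tasks that are now ready to the main task queue.
        for _ in range(len(callback_queue)):
            parked = callback_queue.popleft()
            if parked.is_ready():
                task_queue.append(parked)
            else:
                callback_queue.append(parked)
        # Drain the main queue; unready tasks are parked, never waited on.
        while task_queue:
            task = task_queue.popleft()
            if task.is_ready():
                task.run()
                completed.append(task.name)
            else:
                callback_queue.append(task)
    return completed


def demo_order() -> List[str]:
    """reduce-1 reports ready only from its third poll onward."""
    polls = {"n": 0}

    def slow_ready() -> bool:
        polls["n"] += 1
        return polls["n"] > 2

    tasks = [
        Task("reduce-1", slow_ready, lambda: None),
        Task("reduce-2", lambda: True, lambda: None),
        Task("reduce-3", lambda: True, lambda: None),
    ]
    return run_two_queue(tasks)
```

In `demo_order`, the slow task `reduce-1` is parked on its first poll, so `reduce-2` and `reduce-3` run ahead of it and `reduce-1` runs last. A real scheduler would be event-driven rather than polling in a loop; the polling here only keeps the sketch short.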
format Article
author Khoiruddin, A.A.
Zakaria, N.
Alhussian, H.
title Asynchronous Non-Blocking Algorithm to Handle Straggler Reduce Tasks in Hadoop System
publisher Insight Society
publishDate 2020