WORKER BALANCING IMPLEMENTATION ON DRAGON SCHEDULER FOR DISTRIBUTED DEEP LEARNING IN KUBERNETES

Deep learning generally involves far more computation than conventional machine learning, so it requires a lot of time for the training process. Distributed deep learning is an alternative approach that reduces training time by distributing the computational load across multiple machines....


Bibliographic Details
Main Author: Prima Yoriko, Naufal
Format: Final Project
Language: Indonesia
Online Access: https://digilib.itb.ac.id/gdl/view/65753
Institution: Institut Teknologi Bandung
Language: Indonesia
id id-itb.:65753
spelling id-itb.:657532022-06-24T14:59:25ZWORKER BALANCING IMPLEMENTATION ON DRAGON SCHEDULER FOR DISTRIBUTED DEEP LEARNING IN KUBERNETES Prima Yoriko, Naufal Indonesia Final Project DRAGON scheduler, worker balancing, deep learning, parameter server, job scheduling, Kubernetes, Tensorflow INSTITUT TEKNOLOGI BANDUNG https://digilib.itb.ac.id/gdl/view/65753 Deep learning generally involves far more computation than conventional machine learning, so it requires a lot of time for the training process. Distributed deep learning is an alternative approach that reduces training time by distributing the computational load across multiple machines. The DRAGON scheduler is used to schedule various distributed training jobs that use the parameter server architecture with TensorFlow on a Kubernetes cluster. Its advantage is the ability to scale the number of workers of a training job depending on the availability of resources in the cluster. In the existing scaling implementation, the process of adding and removing workers focuses on one job at a time. However, this implementation turned out to be inefficient in terms of training duration because of a limitation of the parameter server architecture. Due to this limitation, this Final Project modifies the scaling process in the DRAGON scheduler by implementing worker balancing. With worker balancing, training duration can be reduced by 16.305% while maintaining the prediction accuracy of the training results. text
institution Institut Teknologi Bandung
building Institut Teknologi Bandung Library
continent Asia
country Indonesia
content_provider Institut Teknologi Bandung
collection Digital ITB
language Indonesia
description Deep learning generally involves far more computation than conventional machine learning, so it requires a lot of time for the training process. Distributed deep learning is an alternative approach that reduces training time by distributing the computational load across multiple machines. The DRAGON scheduler is used to schedule various distributed training jobs that use the parameter server architecture with TensorFlow on a Kubernetes cluster. Its advantage is the ability to scale the number of workers of a training job depending on the availability of resources in the cluster. In the existing scaling implementation, the process of adding and removing workers focuses on one job at a time. However, this implementation turned out to be inefficient in terms of training duration because of a limitation of the parameter server architecture. Due to this limitation, this Final Project modifies the scaling process in the DRAGON scheduler by implementing worker balancing. With worker balancing, training duration can be reduced by 16.305% while maintaining the prediction accuracy of the training results.
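The abstract contrasts the original DRAGON scaling behaviour (all spare worker slots granted to one job at a time) with the worker balancing proposed in this Final Project (slots spread across jobs). The record does not include the actual implementation, so the following is only a minimal illustrative sketch of the balancing idea; the function name, the dict-based job representation, and the round-robin policy are assumptions for illustration, not the thesis's actual algorithm.

```python
# Hypothetical sketch of the worker-balancing idea: rather than granting all
# spare worker slots to a single training job, spread them round-robin across
# all running jobs, since the parameter server architecture yields diminishing
# returns as a single job's worker count grows.

def balance_workers(jobs, free_slots):
    """Distribute free worker slots round-robin across jobs.

    jobs:       dict mapping job name -> current worker count
    free_slots: number of additional workers the cluster can host
    Returns a new dict with the balanced worker counts.
    """
    balanced = dict(jobs)
    names = sorted(balanced)  # deterministic order for this sketch
    i = 0
    while free_slots > 0 and names:
        balanced[names[i % len(names)]] += 1
        i += 1
        free_slots -= 1
    return balanced

# Example: 4 spare slots are shared by two jobs instead of all going to one.
print(balance_workers({"job-a": 1, "job-b": 1}, 4))
# {'job-a': 3, 'job-b': 3}
```

In the real scheduler this decision would be driven by Kubernetes resource availability and per-job constraints; the sketch only captures the even-distribution policy that distinguishes worker balancing from one-job-first scaling.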
format Final Project
author Prima Yoriko, Naufal
spellingShingle Prima Yoriko, Naufal
WORKER BALANCING IMPLEMENTATION ON DRAGON SCHEDULER FOR DISTRIBUTED DEEP LEARNING IN KUBERNETES
author_facet Prima Yoriko, Naufal
author_sort Prima Yoriko, Naufal
title WORKER BALANCING IMPLEMENTATION ON DRAGON SCHEDULER FOR DISTRIBUTED DEEP LEARNING IN KUBERNETES
title_short WORKER BALANCING IMPLEMENTATION ON DRAGON SCHEDULER FOR DISTRIBUTED DEEP LEARNING IN KUBERNETES
title_full WORKER BALANCING IMPLEMENTATION ON DRAGON SCHEDULER FOR DISTRIBUTED DEEP LEARNING IN KUBERNETES
title_fullStr WORKER BALANCING IMPLEMENTATION ON DRAGON SCHEDULER FOR DISTRIBUTED DEEP LEARNING IN KUBERNETES
title_full_unstemmed WORKER BALANCING IMPLEMENTATION ON DRAGON SCHEDULER FOR DISTRIBUTED DEEP LEARNING IN KUBERNETES
title_sort worker balancing implementation on dragon scheduler for distributed deep learning in kubernetes
url https://digilib.itb.ac.id/gdl/view/65753
_version_ 1822932842453139456