Performance profiling and optimizations in distributed deep learning frameworks

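Deep learning has become a prominent topic in the artificial intelligence industry in recent years and is applied in many fields, such as computer vision and natural language processing. However, training a deep learning model usually takes a long time. It is therefore necessary to identify the bottlenecks of the training process and optimize them to improve training efficiency, especially training speed. Optimizations usually target two aspects: data processing and model training. In this work, multiple optimization methods are studied and implemented to evaluate their effects. For data processing, the optimizations implemented include parallelization of multiple transformation processes, dataset caching, and prefetching of data samples. For training, the work focuses on data parallelism in distributed training, implemented with two currently popular frameworks. Experiments compare the two frameworks and analyze how possible influencing factors affect training speed.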

Bibliographic Details
Main Author:	Zhang, Jiarui
Other Authors:	Lin Zhiping
Format:	Final Year Project
Language:	English
Published:	Nanyang Technological University 2021
Subjects:	Engineering::Electrical and electronic engineering
Online Access:	https://hdl.handle.net/10356/149382
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-149382
record_format	dspace
spelling	sg-ntu-dr.10356-149382; 2023-07-07T18:13:42Z; Performance profiling and optimizations in distributed deep learning frameworks; Zhang, Jiarui; Lin Zhiping; School of Electrical and Electronic Engineering; YITU Pte Ltd; Wang Li; EZPLin@ntu.edu.sg; Engineering::Electrical and electronic engineering; Bachelor of Engineering (Electrical and Electronic Engineering); 2021-05-31T01:14:08Z; 2021; Final Year Project (FYP); Zhang, J. (2021). Performance profiling and optimizations in distributed deep learning frameworks. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/149382; en; B3137-201; application/pdf; Nanyang Technological University
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	Engineering::Electrical and electronic engineering
description	Deep learning has become a prominent topic in the artificial intelligence industry in recent years and is applied in many fields, such as computer vision and natural language processing. However, training a deep learning model usually takes a long time. It is therefore necessary to identify the bottlenecks of the training process and optimize them to improve training efficiency, especially training speed. Optimizations usually target two aspects: data processing and model training. In this work, multiple optimization methods are studied and implemented to evaluate their effects. For data processing, the optimizations implemented include parallelization of multiple transformation processes, dataset caching, and prefetching of data samples. For training, the work focuses on data parallelism in distributed training, implemented with two currently popular frameworks. Experiments compare the two frameworks and analyze how possible influencing factors affect training speed.
author2	Lin Zhiping
author_facet	Lin Zhiping; Zhang, Jiarui
format	Final Year Project
author	Zhang, Jiarui
author_sort	Zhang, Jiarui
title	Performance profiling and optimizations in distributed deep learning frameworks
publisher	Nanyang Technological University
publishDate	2021
url	https://hdl.handle.net/10356/149382
_version_	1772825396348190720

Similar Items