A study of deep learning on many-core processors

Deep learning has recently become a hot topic in many areas, from industry to academia, and the growing number of applications built on it has attracted considerable public attention. Several problems in deep learning remain challenging research topics. With big data, training time is one of the major concerns when designing a deep network, and parallel processing is a promising way to reduce it substantially. This project investigates several strategies for training a deep network in parallel on the Apache SINGA platform: training in synchronous mode, in asynchronous mode, and on the GPU. Although many factors determine the quality of a network design, this project focuses on training time. GPU training reduces the training time of a deep network very substantially, while multi-process training on a single machine achieves a small speedup. A more complete analysis could also consider scalability and measure performance on a cluster.
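The abstract contrasts two parallel-training modes. As a minimal illustrative sketch (not Apache SINGA's actual API — all names and the toy loss here are invented for illustration), the difference can be reduced to scalar SGD on f(w) = (w - 3)^2: synchronous training waits at a barrier and averages every worker's gradient before one global update, while asynchronous training lets each worker push an update computed from possibly stale parameters.

```python
def grad(w):
    # Gradient of the toy loss f(w) = (w - 3)^2.
    return 2.0 * (w - 3.0)

def train_synchronous(workers=4, steps=50, lr=0.1):
    """Synchronous mode: each step, all workers compute gradients on the
    same parameters; the gradients are averaged into one global update."""
    w = 0.0
    for _ in range(steps):
        grads = [grad(w) for _ in range(workers)]  # barrier: wait for all workers
        w -= lr * sum(grads) / len(grads)          # single averaged update
    return w

def train_asynchronous(workers=4, steps=50, lr=0.1):
    """Asynchronous mode: workers push updates without a barrier, each
    computing its gradient from a possibly stale copy of the parameters.
    Round-robin scheduling stands in for workers racing in real time."""
    w = 0.0
    stale = [0.0] * workers                        # each worker's last-seen parameters
    for step in range(steps):
        k = step % workers
        g = grad(stale[k])                         # gradient on stale parameters
        w -= lr * g                                # update without waiting for others
        stale[k] = w                               # worker refreshes its local copy
    return w

if __name__ == "__main__":
    print(train_synchronous())   # converges to the optimum w = 3
    print(train_asynchronous())  # also converges, but with damped oscillation
```

Both runs approach the optimum w = 3, but the asynchronous run overshoots and oscillates before settling, illustrating why staleness can trade convergence quality for reduced synchronisation cost.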


Bibliographic Details
Main Author: Chen, Peng
Other Authors: He Bingsheng
Format: Final Year Project
Language: English
Published: 2016
Subjects:
Online Access:http://hdl.handle.net/10356/66954
Institution: Nanyang Technological University
id sg-ntu-dr.10356-66954
record_format dspace
spelling sg-ntu-dr.10356-669542023-03-03T20:51:16Z A study of deep learning on many-core processors Chen, Peng He Bingsheng School of Computer Engineering DRNTU::Engineering::Computer science and engineering::Computer systems organization::Performance of systems DRNTU::Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence DRNTU::Engineering::Computer science and engineering::Computing methodologies::Pattern recognition Deep learning has recently become a hot topic in many areas, from industry to academia, and the growing number of applications built on it has attracted considerable public attention. Several problems in deep learning remain challenging research topics. With big data, training time is one of the major concerns when designing a deep network, and parallel processing is a promising way to reduce it substantially. This project investigates several strategies for training a deep network in parallel on the Apache SINGA platform: training in synchronous mode, in asynchronous mode, and on the GPU. Although many factors determine the quality of a network design, this project focuses on training time. GPU training reduces the training time of a deep network very substantially, while multi-process training on a single machine achieves a small speedup. A more complete analysis could also consider scalability and measure performance on a cluster. Bachelor of Engineering (Computer Science) 2016-05-06T07:04:13Z 2016-05-06T07:04:13Z 2016 Final Year Project (FYP) http://hdl.handle.net/10356/66954 en Nanyang Technological University 53 p. application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering::Computer science and engineering::Computer systems organization::Performance of systems
DRNTU::Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
DRNTU::Engineering::Computer science and engineering::Computing methodologies::Pattern recognition
spellingShingle DRNTU::Engineering::Computer science and engineering::Computer systems organization::Performance of systems
DRNTU::Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
DRNTU::Engineering::Computer science and engineering::Computing methodologies::Pattern recognition
Chen, Peng
A study of deep learning on many-core processors
description Deep learning has recently become a hot topic in many areas, from industry to academia, and the growing number of applications built on it has attracted considerable public attention. Several problems in deep learning remain challenging research topics. With big data, training time is one of the major concerns when designing a deep network, and parallel processing is a promising way to reduce it substantially. This project investigates several strategies for training a deep network in parallel on the Apache SINGA platform: training in synchronous mode, in asynchronous mode, and on the GPU. Although many factors determine the quality of a network design, this project focuses on training time. GPU training reduces the training time of a deep network very substantially, while multi-process training on a single machine achieves a small speedup. A more complete analysis could also consider scalability and measure performance on a cluster.
author2 He Bingsheng
author_facet He Bingsheng
Chen, Peng
format Final Year Project
author Chen, Peng
author_sort Chen, Peng
title A study of deep learning on many-core processors
title_short A study of deep learning on many-core processors
title_full A study of deep learning on many-core processors
title_fullStr A study of deep learning on many-core processors
title_full_unstemmed A study of deep learning on many-core processors
title_sort study of deep learning on many-core processors
publishDate 2016
url http://hdl.handle.net/10356/66954
_version_ 1759857883191705600