Performance analysis of parallel graph algorithms on heterogeneous processors in OpenCL

Although the GPU was originally intended as a co-processor specializing in graphics rendering, it has recently evolved into a powerful many-core co-processor for general-purpose computation. However, a major obstacle to the wide adoption of GPGPU programming is that different GPU vendors...

Bibliographic Details
Main Author:	Liu, Xiao
Other Authors:	He Bingsheng
Format:	Final Year Project
Language:	English
Published:	2015
Subjects:	DRNTU::Engineering::Computer science and engineering
Online Access:	http://hdl.handle.net/10356/62711
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-62711
record_format	dspace
spelling	sg-ntu-dr.10356-62711 2023-03-03T20:47:45Z Performance analysis of parallel graph algorithms on heterogeneous processors in OpenCL Liu, Xiao He Bingsheng School of Computer Engineering DRNTU::Engineering::Computer science and engineering Although the GPU was originally intended as a co-processor specializing in graphics rendering, it has recently evolved into a powerful many-core co-processor for general-purpose computation. However, a major obstacle to the wide adoption of GPGPU programming is that different GPU vendors have their own programming languages and platforms. To address this issue, the Open Computing Language (OpenCL) provides a unified programming interface for various parallel computing platforms. OpenCL also makes cross-platform comparison of different hardware resources possible. Inspired by the Medusa programming framework on CUDA, we set out to explore the actual speedup between heterogeneous processors for parallel programs and to examine the impact of different data storage layouts. We experimented with OpenCL implementations of the PageRank and breadth-first search algorithms on Intel and NVIDIA graphics cards, using various combinations of data storage layouts such as Column-major Adjacency Array (CAA), Adjacency Array (AA), Structure of Arrays (SOA) and Array of Structures (AOS). Our experimental results showed that, for graph algorithms, GPUs substantially outperform CPUs in executing parallel tasks, and that CAA is preferable to AA as a memory layout, especially when the GPU is less sophisticated and the computation is vertex-oriented. Likewise, SOA is preferable to AOS, especially when the GPU is less sophisticated. Bachelor of Engineering (Computer Engineering) 2015-04-27T09:19:16Z 2015-04-27T09:19:16Z 2015 2015 Final Year Project (FYP) http://hdl.handle.net/10356/62711 en Nanyang Technological University 54 p. application/pdf
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	DRNTU::Engineering::Computer science and engineering
spellingShingle	DRNTU::Engineering::Computer science and engineering Liu, Xiao Performance analysis of parallel graph algorithms on heterogeneous processors in OpenCL
description	Although the GPU was originally intended as a co-processor specializing in graphics rendering, it has recently evolved into a powerful many-core co-processor for general-purpose computation. However, a major obstacle to the wide adoption of GPGPU programming is that different GPU vendors have their own programming languages and platforms. To address this issue, the Open Computing Language (OpenCL) provides a unified programming interface for various parallel computing platforms. OpenCL also makes cross-platform comparison of different hardware resources possible. Inspired by the Medusa programming framework on CUDA, we set out to explore the actual speedup between heterogeneous processors for parallel programs and to examine the impact of different data storage layouts. We experimented with OpenCL implementations of the PageRank and breadth-first search algorithms on Intel and NVIDIA graphics cards, using various combinations of data storage layouts such as Column-major Adjacency Array (CAA), Adjacency Array (AA), Structure of Arrays (SOA) and Array of Structures (AOS). Our experimental results showed that, for graph algorithms, GPUs substantially outperform CPUs in executing parallel tasks, and that CAA is preferable to AA as a memory layout, especially when the GPU is less sophisticated and the computation is vertex-oriented. Likewise, SOA is preferable to AOS, especially when the GPU is less sophisticated.
author2	He Bingsheng
author_facet	He Bingsheng Liu, Xiao
format	Final Year Project
author	Liu, Xiao
author_sort	Liu, Xiao
title	Performance analysis of parallel graph algorithms on heterogeneous processors in OpenCL
title_short	Performance analysis of parallel graph algorithms on heterogeneous processors in OpenCL
title_full	Performance analysis of parallel graph algorithms on heterogeneous processors in OpenCL
title_fullStr	Performance analysis of parallel graph algorithms on heterogeneous processors in OpenCL
title_full_unstemmed	Performance analysis of parallel graph algorithms on heterogeneous processors in OpenCL
title_sort	performance analysis of parallel graph algorithms on heterogeneous processors in opencl
publishDate	2015
url	http://hdl.handle.net/10356/62711
_version_	1759853961234350080

Similar Items