Loop unroll optimization for GPU implementation

This report presents the implementation and optimization of two image resize algorithms, namely bilinear and bicubic interpolation. The optimization aims to improve execution time and is done primarily with Nvidia’s Compute Unified Device Architecture (CUDA). Bo...

Bibliographic Details
Main Author:	Wu, Jianghua.
Other Authors:	School of Computer Engineering
Format:	Final Year Project
Language:	English
Published:	2012
Subjects:	DRNTU::Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision
Online Access:	http://hdl.handle.net/10356/48561
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-48561
record_format	dspace
spelling	sg-ntu-dr.10356-485612023-03-03T20:51:26Z Loop unroll optimization for GPU implementation Wu, Jianghua. School of Computer Engineering Zhang Wei DRNTU::Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision This report presents the implementation and optimization of two image resize algorithms, namely bilinear and bicubic interpolation. The optimization aims to improve execution time and is done primarily with Nvidia’s Compute Unified Device Architecture (CUDA). Both algorithms are implemented in C++ before the CUDA code is added. The challenges in the project are learning CUDA programming and understanding the math involved before converting it into algorithms. Integrating CUDA, by replacing loops in the computations with threads running in parallel, demonstrated a significant speedup in execution time. There is still room for code refactoring, a better CUDA implementation, and the use of a more powerful Graphics Processing Unit (GPU), which would improve the design and further optimize the developed application. In conclusion, the project has shown that, under certain conditions, leveraging the power of the GPU through CUDA is a viable optimization tool for graphics processing algorithms such as the ones mentioned above. Bachelor of Engineering (Computer Engineering) 2012-04-26T04:20:06Z 2012-04-26T04:20:06Z 2012 2012 Final Year Project (FYP) http://hdl.handle.net/10356/48561 en Nanyang Technological University 100 p. application/pdf
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	DRNTU::Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision
spellingShingle	DRNTU::Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision Wu, Jianghua. Loop unroll optimization for GPU implementation
description	This report presents the implementation and optimization of two image resize algorithms, namely bilinear and bicubic interpolation. The optimization aims to improve execution time and is done primarily with Nvidia’s Compute Unified Device Architecture (CUDA). Both algorithms are implemented in C++ before the CUDA code is added. The challenges in the project are learning CUDA programming and understanding the math involved before converting it into algorithms. Integrating CUDA, by replacing loops in the computations with threads running in parallel, demonstrated a significant speedup in execution time. There is still room for code refactoring, a better CUDA implementation, and the use of a more powerful Graphics Processing Unit (GPU), which would improve the design and further optimize the developed application. In conclusion, the project has shown that, under certain conditions, leveraging the power of the GPU through CUDA is a viable optimization tool for graphics processing algorithms such as the ones mentioned above.
author2	School of Computer Engineering
author_facet	School of Computer Engineering Wu, Jianghua.
format	Final Year Project
author	Wu, Jianghua.
author_sort	Wu, Jianghua.
title	Loop unroll optimization for GPU implementation
title_sort	loop unroll optimization for gpu implementation
publishDate	2012
url	http://hdl.handle.net/10356/48561
_version_	1759853489783046144
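
Illustrative note: the abstract describes replacing loops in the computations with threads running in parallel. The report's own code is not reproduced in this record, so the following is only a minimal C++ sketch of a loop-based bilinear resize under stated assumptions: a row-major grayscale image stored as floats, pixel-centre coordinate mapping, and clamping at the borders. The names bilinearSample and resizeBilinear are hypothetical and are not taken from the report.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: compute one output pixel by bilinear interpolation.
// src is a row-major grayscale image of size srcW x srcH.
float bilinearSample(const std::vector<float>& src, int srcW, int srcH,
                     int dstW, int dstH, int x, int y) {
    // Map the centre of output pixel (x, y) back into source coordinates.
    float fx = (x + 0.5f) * srcW / dstW - 0.5f;
    float fy = (y + 0.5f) * srcH / dstH - 0.5f;

    // Clamp so that samples outside the image reuse the border pixels.
    fx = std::max(0.0f, std::min(fx, static_cast<float>(srcW - 1)));
    fy = std::max(0.0f, std::min(fy, static_cast<float>(srcH - 1)));

    // Indices of the four neighbouring source pixels.
    int x0 = static_cast<int>(fx);
    int y0 = static_cast<int>(fy);
    int x1 = std::min(x0 + 1, srcW - 1);
    int y1 = std::min(y0 + 1, srcH - 1);

    // Fractional offsets act as the interpolation weights.
    float tx = fx - x0;
    float ty = fy - y0;

    // Interpolate along x on the two rows, then along y between them.
    float top = src[y0 * srcW + x0] * (1.0f - tx) + src[y0 * srcW + x1] * tx;
    float bottom = src[y1 * srcW + x0] * (1.0f - tx) + src[y1 * srcW + x1] * tx;
    return top * (1.0f - ty) + bottom * ty;
}

// Sequential version: two nested loops visit every output pixel.
std::vector<float> resizeBilinear(const std::vector<float>& src, int srcW,
                                  int srcH, int dstW, int dstH) {
    assert(srcW > 0 && srcH > 0 && dstW > 0 && dstH > 0);
    assert(src.size() == static_cast<std::size_t>(srcW) * srcH);

    std::vector<float> dst(static_cast<std::size_t>(dstW) * dstH);
    for (int y = 0; y < dstH; ++y) {
        for (int x = 0; x < dstW; ++x) {
            // Each iteration reads only src and writes one distinct element
            // of dst, so iterations do not depend on one another.
            dst[y * dstW + x] = bilinearSample(src, srcW, srcH, dstW, dstH, x, y);
        }
    }
    return dst;
}
```

Because no iteration depends on another, a CUDA port could plausibly assign one thread to each (x, y) pair and drop the two loops; that mapping is an assumption about the general approach, not a description of the report's actual kernels.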


Similar Items