An improved CUDA-based implementation of differential evolution on GPU

Modern GPUs enable widely affordable personal computers to carry out massively parallel computation tasks. NVIDIA's CUDA technology provides a wieldy parallel computing platform. Many state-of-the-art algorithms arising from different fields have been redesigned based on CUDA to achieve computa...

Bibliographic Details
Main Authors:	Raimondo, Federico; Forbes, Florence; Ong, Yew Soon; Qin, A. K.
Other Authors:	School of Computer Engineering
Format:	Conference or Workshop Item
Language:	English
Published:	2013
Subjects:	DRNTU::Engineering::Computer science and engineering
Online Access:	https://hdl.handle.net/10356/100559 http://hdl.handle.net/10220/16289
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-100559
record_format	dspace
conference	International conference on Genetic and evolutionary computation conference (14th : 2012)
citation	Qin, A. K., Raimondo, F., Forbes, F., & Ong, Y. S. (2012). An improved CUDA-based implementation of differential evolution on GPU. Proceedings of the fourteenth international conference on Genetic and evolutionary computation conference, 991-998.
doi	10.1145/2330163.2330301
institution	Nanyang Technological University
building	NTU Library
country	Singapore
collection	DR-NTU
language	English
topic	DRNTU::Engineering::Computer science and engineering
description	Modern GPUs enable widely affordable personal computers to carry out massively parallel computation tasks. NVIDIA's CUDA technology provides a wieldy parallel computing platform. Many state-of-the-art algorithms arising from different fields have been redesigned based on CUDA to achieve computational speedup. Differential evolution (DE), as a very promising evolutionary algorithm, is highly suitable for parallelization owing to its data-parallel algorithmic structure. However, most existing CUDA-based DE implementations suffer from excessive low-throughput memory access and less efficient device utilization. This work presents an improved CUDA-based DE to optimize memory and device utilization: several logically-related kernels are combined into one composite kernel to reduce global memory access; kernel execution configuration parameters are automatically determined to maximize device occupancy; streams are employed to enable concurrent kernel execution to maximize device utilization. Experimental results on several numerical problems demonstrate superior computational time efficiency of the proposed method over two recent CUDA-based DE and the sequential DE across varying problem dimensions and algorithmic population sizes.
author2	School of Computer Engineering
format	Conference or Workshop Item
author	Raimondo, Federico; Forbes, Florence; Ong, Yew Soon; Qin, A. K.
author_sort	Raimondo, Federico
title	An improved CUDA-based implementation of differential evolution on GPU
publishDate	2013
url	https://hdl.handle.net/10356/100559 http://hdl.handle.net/10220/16289
_version_	1681059210660413440
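
The description above calls differential evolution (DE) data-parallel. The sketch below is one way to see why: it is a plain sequential Python illustration, not the authors' CUDA implementation, and it assumes the common DE/rand/1/bin variant, which this record does not specify. All names (`sphere`, `make_trial`, `differential_evolution`) and parameter values (`f`, `cr`, the search range) are hypothetical choices made for illustration. Each trial vector is built only by reading the current population, so every individual could in principle be assigned to its own GPU thread.

```python
import random


def sphere(x):
    """Sphere benchmark: sum of squares, with minimum 0 at the origin."""
    return sum(v * v for v in x)


def make_trial(pop, i, f, cr, rng):
    """Build the trial vector for individual i (DE/rand/1/bin, assumed variant).

    The function only reads `pop`, so calls for different i are independent.
    """
    dim = len(pop[i])
    # Pick three distinct donor indices, all different from i.
    r1, r2, r3 = rng.sample([k for k in range(len(pop)) if k != i], 3)
    j_rand = rng.randrange(dim)  # ensures at least one gene comes from the mutant
    trial = []
    for j in range(dim):
        if j == j_rand or rng.random() < cr:
            trial.append(pop[r1][j] + f * (pop[r2][j] - pop[r3][j]))
        else:
            trial.append(pop[i][j])
    return trial


def differential_evolution(fitness, dim, pop_size=20, f=0.5, cr=0.9,
                           generations=100, seed=0):
    """Minimize `fitness` and return (best_vector, best_cost)."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-5.0, 5.0) for _ in range(dim)]
           for _ in range(pop_size)]
    cost = [fitness(x) for x in pop]
    for _ in range(generations):
        # Phase 1: trial generation and evaluation, independent per individual.
        trials = [make_trial(pop, i, f, cr, rng) for i in range(pop_size)]
        trial_cost = [fitness(t) for t in trials]
        # Phase 2: one-to-one greedy selection, also independent per individual.
        for i in range(pop_size):
            if trial_cost[i] <= cost[i]:
                pop[i], cost[i] = trials[i], trial_cost[i]
    best = min(range(pop_size), key=cost.__getitem__)
    return pop[best], cost[best]


if __name__ == "__main__":
    best_x, best_cost = differential_evolution(sphere, dim=5)
    print(best_cost)
```

In a GPU setting, the two phases inside the generation loop are the kind of per-individual steps that might each be written as a separate kernel. Merging such steps into one kernel so that intermediate vectors need not pass through global memory is, loosely, the idea behind the composite kernel mentioned in the description; the sketch makes no attempt to reproduce that design.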

Similar Items