Fast and scalable data mining algorithms for clustering using graphics processors
The tremendous growth of data in the order of zeta bytes in various real-life databases has fortified the need for high speed and scalable data mining methods. Computing applications for data warehouses and data streams are highly arithmetic and computationally intensive, thus demanding high perform...
Saved in:
Main Author: | |
---|---|
Other Authors: | |
Format: | Theses and Dissertations |
Language: | English |
Published: |
2014
|
Subjects: | |
Online Access: | http://hdl.handle.net/10356/55278 |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Institution: | Nanyang Technological University |
Language: | English |
Summary: | The tremendous growth of data in the order of zeta bytes in various real-life databases has fortified the need for high speed and scalable data mining methods. Computing applications for data warehouses and data streams are highly arithmetic and computationally intensive, thus demanding high performance data processing methods. Clustering techniques at large form the foundational building blocks in data mining, machine learning, etc. In light of this aspiration, researchers have persevered vigorously to design and develop efficient schemes for different data mining tasks. Clustering is an important data mining task. In this thesis our focus is on clustering.
This thesis outlines, experiments, discusses, and presents the efficient implementation of several clustering techniques executable on a modern Graphics Processing Unit (GPU) of a typical desk-top computer. The characteristics of clustering algorithms that lead to speed and scalable performance are identified. Aspects of parallelizing computations which are highly repetitive, arithmetically intense and data independent are analyzed and a general model that can be used to parallelize data mining algorithms is proposed. At the end of each experimental chapter we have listed the lessons learnt from the exploration and analysis conducted through the study so as to benefit the data mining community at large.
GPU manufacturers such as Nvidia Corporation keep adding features to their chips to facilitate computing, but in parallel there is a continued effort required by programmers to access these features using either Open Graphics Language (OpenGL) or DirectX as graphics languages. Researchers in the field of computing continue to camouflage their computations as graphics problems, writing their computations in a graphics-oriented shading language such as OpenGL’s Graphics Library Shading Language (GLSL) or Microsoft’s High Level Shader Language (HLSL). Compared to DirectX, OpenGL turned out to be a better choice for our initial research on general-purpose computations due to various reasons. In order to maximize the utilization of their GPUs for general-purpose computing, and to reach a number of researchers possible, Nvidia took industry standard C and added graphics programmable key components to it in order to harness some of the special features into Compute Unified Device Architecture (CUDA), though only at the launch of the GeForce 8800 GTX GPU, Nvidia made a compiler for CUDA available to the public. Thus CUDA became the first architecture specifically designed by a GPU manufacturer to enable general-purpose computing on GPUs. In the first part of the thesis, we compare the CPU based computing architectures to the GPU based computing architectures on desktop systems, explore the usage of existing graphics programming languages for general-purpose computing and we strategize to extend it to data clustering. We chose OpenGL as the suitable platform to implement clustering algorithms and explain the concepts through GPU based sorting and reduction techniques. We identify the challenges and performance issues in using GPU for clustering.
In the second part of this thesis, we address performance issues such as utilization of GPU texture arrays, GPU – CPU communications, use of Shaders etc. in the implementation of the non-overlapping partition k-means clustering algorithm on the GPU. The ratio between the computational time taken by the CPU and the computational time taken by the GPU time gives the speed gain. We demonstrate remarkable speed gain of 12 to 64 times on the Nvidia 5900 GPU and about 20 to 140 speed gain on the Nvidia 8500 GPU in computational time per iteration for evaluations with various cluster sizes. Further we develop and implement another popular partitioned and overlapping clustering algorithm called fuzzy c-means (FCM) on the GPU. Astounding speed gains up to 140 times on GPU 8800GTX and up to 73 times on GPU 8500GT are realized using real-life microarray gene expressions.
In the third part of this thesis, we explore the use of Nvidia’s Compute Unified Device Architecture (CUDA) based framework to develop efficient low-cost, high performance implementation of the Hierarchical Agglomerative Clustering (HAC) and Partially Overlapping Partitioning (PoP) based HAC on the GPU. We address the various research issues on data clustering with CUDA. Speed gains about 15 to 90 times are achieved for the traditional HAC and up to 442 times for the PoP HAC for various combinations of cluster parameters.
In the last part of this thesis discuss characteristics of economic yet efficient parallel computing architectures for clustering algorithms in desktop computer using its GPU and propose a general model which can be applied to parallelize future data mining computations. |
---|