HPC simulations of information propagation over complex networks

Simulation provides a flexible and valuable method to study the behavior of information propagation over complex networks. High Performance Computing (HPC) is a technology that may be used to apply optimized algorithms on powerful new hardware resources for accelerating execution performance. With t...

Full description

Saved in:
Bibliographic Details
Main Author: Jin, Jiangming
Other Authors: Lee Bu Sung, Francis
Format: Theses and Dissertations
Language:English
Published: 2015
Subjects:
Online Access:https://hdl.handle.net/10356/62061
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Nanyang Technological University
Language: English
Description
Summary:Simulation provides a flexible and valuable method to study the behavior of information propagation over complex networks. High Performance Computing (HPC) is a technology that may be used to apply optimized algorithms on powerful new hardware resources for accelerating execution performance. With the increased computing resources required for simulation of large-scale networks, it is attractive to apply HPC techniques to promote the simulation performance both in the temporal and spatial domains. This thesis describes a simulation methodology that can facilitate simulations of information propagation in HPC environments. The novelty of our approach comes from the way we provide optimized simulation strategies and integrate such strategies with emerging computing architectures, including both Multi-core CPU and Many-core GPGPU for parallel performance acceleration. As a motivational example, an agent-based infectious disease simulation is used to illustrate the detailed performance modeling and runtime algorithmic adaptation during simulation, both in a serial processing environment and in a Multi-core CPU parallel processing environment. Our observations indicate that a significant performance gain can be achieved based on algorithmic adaptation at runtime. However, the experimental results show that the number of CPU cores limits the scalability of parallel processing, and the OpenMP overhead in creating and destroying threads is a major factor in the bottleneck of parallel execution. Inspired by the motivational example, we can construct a general simulation methodology with optimized execution strategies. We introduce two general simulation algorithms derived from two widely used information propagation models. In addition, two types of data structure can be defined for a given network, named the Vertex-Oriented Structure, and the Edge-Oriented Structure. According to these two data structures, we can design two types of execution approaches: Vertex-Oriented Processing and Edge-Oriented Processing. With vertex-oriented processing, we can implement an adaptive simulation algorithm which enables different algorithms to be interchanged at runtime in order to obtain execution performance benefits. Using performance modeling and prediction, we are able to derive accurate rules for the runtime algorithmic adaptation. According to the proposed simulation strategies, we implement the general simulation algorithms for information propagation on both Multi-core CPU and Many-core GPGPU. We adopt identical data structures and the same random number generation library on GPU and CPU platforms in order to compare and analyze the performance. Both vertex-oriented processing and edge-oriented processing are comprehensively investigated on both Random and Scale-free networks. The performance benefits of vertex-oriented and edge-oriented processing are compared and analyzed. Because of the limited size of GPU memory, the single GPU execution algorithms are further extended to multiple GPU devices. The system design is illustrated in detail, including the network partitioning and replication strategy, and the data synchronization scheme. Furthermore, the performance of distributed simulation across multiple GPUs is also investigated in detail. Finally, as a case study, this thesis describes studies of viral advertisement diffusion using our simulation strategies on CPU and multiple GPUs. The advertisement diffusion behaviors influenced by varying thresholds and different initial nodes selection policies are investigated. According to the experimental results, we can observe that the largest degree selection policy and a large number of initial selected nodes can help in maximizing the diffusion effect in the beginning and early stages of advertisement diffusion. However, the influence of a good node selection policy in advertisement diffusion is not that crucial compared to reducing the threshold. The conclusions given by the simulation studies are comparable with real cases in marketing studies.