Exploiting the parallelism of large-scale application-layer networks by adaptive GPU-based simulation
Philipp Andelfinger, Hannes Hartenstein +1 more
- pp 3471-3482
TLDR
This paper presents a GPU-based simulator engine that performs all steps of large-scale network simulations on a commodity many-core GPU and adapts its configuration at runtime, balancing parallelism against overheads to achieve high performance for a given network model and scenario.
Abstract:
We present a GPU-based simulator engine that performs all steps of large-scale network simulations on a commodity many-core GPU. Overhead is reduced by avoiding unnecessary data transfers between graphics memory and main memory. Using a widely deployed peer-to-peer network as an example, we analyze the parallelism in large-scale application-layer networks; the analysis suggests the use of thousands of concurrent processor cores for simulation. The proposed simulator employs the vast number of parallel cores in modern GPUs to exploit the identified parallelism and enables substantial simulation speedup. The simulator adapts its configuration at runtime in order to balance parallelism and overheads to achieve high performance for a given network model and scenario. A performance evaluation for simulations of networks comprising up to one million peers demonstrates a speedup of up to 19.5 compared with an efficient sequential implementation and shows the effectiveness of the runtime adaptation to different network conditions.
Citations
Proceedings ArticleDOI
Simian integrated framework for parallel discrete event simulation on GPUs
TL;DR: A new framework integrated in the Simian engine is presented, which enables efficient use of GPUs for computationally intense sections of code and allows modellers to offload some or all handlers to the GPU by efficiently grouping and scheduling these handlers.
Journal ArticleDOI
Synchronous speculative simulation of tightly coupled agents in continuous time on CPUs and GPUs
TL;DR: In this article, the authors propose synchronous optimistic synchronization algorithms tailored toward simulations of fine-grained interactions among tightly coupled agents in highly dynamic topologies and present implementations targeting multicore central processing units (CPUs) as well as many-core GPUs.
Proceedings ArticleDOI
Model-Based Concurrency Analysis of Network Simulations
TL;DR: An analytical model is proposed that enables concurrency estimations based on model knowledge and on statistics gathered from sequential simulation runs, and enables insights into the relationship between the topology and communication patterns of the simulated network, and the resulting concurrency.
Proceedings ArticleDOI
Advanced Tutorial: Parallel and Distributed Methods for Scalable Discrete Simulation
Philipp Andelfinger, Wentong Cai +1 more
TL;DR: In this article, the fundamental notions of parallel and distributed simulation and synchronization algorithms are described and summarized under the constraints of the domains of transportation and spiking neural networks, and current research directions and challenges are discussed in light of the tension between efficiency through specialization and wide applicability through generalization.
Journal ArticleDOI
Transitioning Spiking Neural Network Simulators to Heterogeneous Hardware
TL;DR: This article proposes a transition approach for CPU-based SNN simulators to enable the execution on heterogeneous hardware, with only limited modifications to an existing simulator code base and without changes to model code.
References
Book ChapterDOI
Kademlia: A Peer-to-Peer Information System Based on the XOR Metric
Petar Maymounkov, David Mazières +1 more
TL;DR: In this paper, the authors describe a peer-to-peer distributed hash table with provable consistency and performance in a fault-prone environment, which routes queries and locates nodes using a novel XOR-based metric topology.
Proceedings ArticleDOI
Inter-block GPU communication via fast barrier synchronization
Shucai Xiao, Wu-chun Feng +1 more
TL;DR: This work proposes two approaches for inter-block GPU communication via barrier synchronization: GPU lock-based synchronization and GPU lock-free synchronization, and evaluates the efficacy of each approach via a micro-benchmark as well as three well-known algorithms: Fast Fourier Transform, dynamic programming, and bitonic sort.
Journal ArticleDOI
The cost of conservative synchronization in parallel discrete event simulations
TL;DR: It is shown that on large problems (those for which parallel processing is ideally suited) there is often enough parallel workload so that processors are not usually idle, and the method is within a constant factor of optimal.
Efficient Parallel Scan Algorithms for GPUs
TL;DR: This paper describes the design of efficient scan and segmented scan parallel primitives in CUDA for execution on GPUs using a divide-and-conquer approach and demonstrates that this design methodology results in routines that are simple, highly efficient, and free of irregular access patterns that lead to memory bank conflicts.
Proceedings ArticleDOI
Discrete-event Execution Alternatives on General Purpose Graphical Processing Units (GPGPUs)
TL;DR: Initial performance results on simulation of a diffusion process show that DES-style execution on GPGPU runs faster than DES on CPU and also significantly faster than time-stepped simulations on either CPU or GPGPU.
Related Papers (5)
Parallel GPU architecture simulation framework exploiting work allocation unit parallelism
Sangpil Lee, Won Woo Ro +1 more
Distributed time, conservative parallel logic simulation on GPUs
Bo Wang, Yuhao Zhu, Yangdong Deng +2 more
Pipe-Torch: Pipeline-Based Distributed Deep Learning in a GPU Cluster with Heterogeneous Networking
Jun Zhan, Jinghui Zhang +1 more