Topic

Degree of parallelism

About: Degree of parallelism is a research topic. Over its lifetime, 1,515 publications have appeared within this topic, receiving 25,546 citations.

Papers

Book Chapter•DOI•

A parallel shape optimizing load balancer

Henning Meyerhenke¹, Stefan Schamberger¹•Institutions (1)

University of Paderborn¹

28 Aug 2006

TL;DR: A distributed implementation of a load balancing heuristic for parallel adaptive FEM simulations based on a disturbed diffusion scheme embedded in a learning framework that helps to omit unnecessary computations as well as replace the domain decomposition by an alternative data distribution scheme reducing the communication overhead.

Abstract: Load balancing is an important issue in parallel numerical simulations. However, state-of-the-art libraries addressing this problem show several deficiencies: they are hard to parallelize, focus on small edge-cuts rather than few boundary vertices, and often produce disconnected partitions. We present a distributed implementation of a load balancing heuristic for parallel adaptive FEM simulations. It is based on a disturbed diffusion scheme embedded in a learning framework. This approach offers a high degree of parallelism that can be exploited, and it computes well-shaped partitions, as shown in previous publications. Our focus lies on improving the condition of the involved matrix and solving the resulting linear systems with local accuracy. This helps to omit unnecessary computations and allows the domain decomposition to be replaced by an alternative data distribution scheme that reduces the communication overhead, as shown by experiments with our new MPI-based implementation.

14 citations

Proceedings Article•DOI•

Parallelization of tau-leap coarse-grained Monte Carlo simulations on GPUs

Lifan Xu¹, Michela Taufer¹, Stuart D. Collins², Dionisios G. Vlachos²•Institutions (2)

University UCINF¹, University of Delaware²

19 Apr 2010

TL;DR: This paper shows how the efficient parallelization of the tau-leap method for GPUs includes the redefinition of its data structures, the redesign of its algorithm, and the selection of the most appropriate degree of parallelism on a single GPU or multiple GPUs.

Abstract: The Coarse-Grained Monte Carlo (CGMC) method is a multi-scale stochastic mathematical and simulation framework for spatially distributed systems. CGMC simulations are important tools for studying phenomena such as catalysis, crystal growth, surface diffusion, phase transitions on single crystals, and cell membrane receptor dynamics. In parallel CGMC, the tau-leap method is used for parallel simulations that are executed on traditional CPU clusters in a master-slave setting. Unfortunately, the communications between master and slaves negatively impact speedup and scalability. In this paper, we explore the potential of GPUs for the tau-leap method and we present an extensive performance evaluation that leads to the most suitable degree of parallelism for this method under different simulation profiles. We show how the efficient parallelization of the tau-leap method for GPUs includes (1) the redefinition of its data structures, (2) the redesign of its algorithm, and (3) the selection of the most appropriate degree of parallelism (i.e., fine-grained or coarse-grained) on a single GPU or multiple GPUs. Exceptional performance improvements can thus be achieved for this method.

14 citations

Proceedings Article•DOI•

Compiling HPC Kernels for the REDEFINE CGRA

Kavitha T. Madhu¹, Saptarsi Das¹, S. Nalesh¹, S. K. Nandy¹, Ranjani Narayan•Institutions (1)

Indian Institute of Science¹

24 Aug 2015

TL;DR: The proposed compilation flow aims at exposing a high degree of parallelism in loop nests in HPC application kernels using polyhedral analysis and generates meta-data to effectively utilize the computational resources in HyperCells.

Abstract: In this paper, we present a compilation flow for HPC kernels on the REDEFINE coarse-grain reconfigurable architecture (CGRA). REDEFINE is a scalable macro-dataflow machine in which the compute elements (CEs) communicate through messages. REDEFINE offers the ability to exploit a high degree of coarse-grain and pipeline parallelism. The CEs in REDEFINE are enhanced with reconfigurable macro data-paths called HyperCells that enable exploitation of fine-grain and pipeline parallelism at the level of basic instructions in static dataflow order. Application kernels that exhibit regularity in computations and memory accesses, such as affine loop nests, benefit from the architecture of HyperCell [1], [2]. The proposed compilation flow aims at exposing a high degree of parallelism in loop nests in HPC application kernels using polyhedral analysis and generates meta-data to effectively utilize the computational resources in HyperCells. Memory is explicitly managed with the compiler's assistance. We address compilation challenges such as partitioning with load balancing, mapping and scheduling computations, and management of operand data while targeting multiple HyperCells in the REDEFINE architecture. The proposed solution scales well, meeting the performance objectives of HPC.

14 citations

Journal Article•DOI•

Hardware optimization and serial implementation of a novel spiking neuron model for the POEtic tissue.

Oriol Yuguero Torres, Jan Eriksson¹, Juan Manuel Moreno, Alessandro E. P. Villa¹,²•Institutions (2)

University of Lausanne¹, Joseph Fourier University²

01 Aug 2004-BioSystems

TL;DR: The hardware implementation of a spiking neuron model, which uses a spike time dependent plasticity (STDP) rule that allows synaptic changes by discrete time steps, is described and the serial implementation has been realized.

Abstract: In this paper we describe the hardware implementation of a spiking neuron model, which uses a spike time dependent plasticity (STDP) rule that allows synaptic changes by discrete time steps. For this purpose an integrate-and-fire neuron is used with recurrent local connections. The connectivity of this model has been set to 24 neighbours, so there is a high degree of parallelism. After obtaining good results with the hardware implementation of the model, we proceed to simplify this hardware description while trying to keep the same behaviour. Experiments using dynamic grading patterns were carried out to test the learning capabilities of the model. Finally, the serial implementation has been realized.

14 citations

Book Chapter•DOI•

A Methodology for the Formal Analysis of Asynchronous Micropipelines

Antonio Cerone¹, George J. Milne²•Institutions (2)

University of Queensland¹, University of South Australia²

01 Nov 2000

TL;DR: In this article, the authors present a process algebra approach for the integrated verification of correctness and performance in concurrent systems, which is entirely performed within the Circal process algebra, without any recourse to other formalisms.

Abstract: In this paper we present a process algebra approach for the integrated verification of correctness and performance in concurrent systems. The verification procedure is entirely performed within the Circal process algebra, without any recourse to other formalisms. Performance is characterised in terms of logical properties, which do not incorporate explicit time. Such properties are then interpreted in terms of degree of parallelism and allow the quantitative evaluation of the throughput of the system. The approach has been applied to two four-phase handshaking protocols, which are motivated by the implementation of the AMULET2 asynchronous RISC processor. Both correctness and performance properties are captured in the same verification framework and automatically proved using the Circal System.

14 citations

Metrics

Papers: 1,515

Citations: 27,447

No. of papers in the topic in previous years
Year	Papers
2022	1
2021	47
2020	48
2019	52
2018	70
2017	75