Topic

Degree of parallelism

About: Degree of parallelism is a research topic. Over its lifetime, 1,515 publications have been published on this topic, receiving 25,546 citations.


Papers
Book Chapter
01 Jan 1996
TL;DR: This chapter discusses the parallelization, using MPI on distributed-memory architectures, of the sequential third-generation wave model SWAN (Simulating WAves Nearshore), which computes wind-generated waves in coastal regions.
Abstract: This chapter discusses the parallelization of the sequential code SWAN (Simulating WAves Nearshore) on distributed-memory architectures using MPI. Efficient parallel algorithms are required to calculate spectra of random, short-crested, wind-generated waves in coastal regions using the third-generation wave model SWAN. The propagation schemes used in SWAN are fully implicit, so that they can also be used to compute waves in shallow water. Two strategies for parallelizing these schemes are presented: (1) the block Jacobi approximation, which has a high degree of parallelism, and (2) the block wavefront approach, which is parallelizable to a large extent. Unlike the first, the latter converges exactly as the sequential method does. Numerical experiments with a real-life application are run on a dedicated Beowulf cluster. They show that good speedups are achieved with the block wavefront approach, as long as the computational domain is not divided into slices that are too thin. For the block Jacobi method, a considerable decline in performance is observed, attributable to the numerical overhead arising from tripling the number of iterations.

8 citations
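The contrast between the two strategies is easy to sketch in code. Below is a minimal NumPy illustration of the wavefront idea for a generic implicit upwind stencil; it is not SWAN's actual implementation, and the 1/3-weighted update and toy grid are assumptions made for the example. Cells on the same anti-diagonal depend only on already-updated neighbours, so each diagonal can be processed in parallel, whereas block Jacobi trades convergence for parallelism by sweeping subdomains with stale halo values.

```python
import numpy as np

def wavefront_sweep(u, f):
    # One sweep of an implicit first-order upwind stencil on an n-by-m
    # grid: u[i, j] depends on the already-updated u[i-1, j] and
    # u[i, j-1], so all cells on one anti-diagonal (i + j == d) are
    # mutually independent and each diagonal can be updated in parallel
    # (vectorised here). A block Jacobi variant would instead sweep all
    # subdomains concurrently using stale neighbour values, gaining
    # parallelism at the cost of extra outer iterations.
    n, m = u.shape
    for d in range(2, n + m - 1):        # anti-diagonals of the interior
        i = np.arange(max(1, d - (m - 1)), min(n - 1, d - 1) + 1)
        j = d - i
        u[i, j] = (f[i, j] + u[i - 1, j] + u[i, j - 1]) / 3.0
    return u

# Toy usage: 6x6 grid with a fixed zero boundary, iterated until settled.
u, f = np.zeros((6, 6)), np.ones((6, 6))
for _ in range(50):
    u = wavefront_sweep(u, f)
```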

Journal Article
18 Mar 2019, PeerJ
TL;DR: This paper proposes using speculation to unleash parallelism when it is uncertain whether some tasks will modify data, and formalizes a new methodology to enable speculative execution in a graph of tasks.
Abstract: Task-based programming models have demonstrated their efficiency in the development of scientific applications on modern high-performance platforms. They delegate the management of parallelization to the runtime system (RS), which is in charge of data coherency, scheduling, and assigning work to the computational units. However, some applications have a limited degree of parallelism, such that no matter how efficient the RS implementation is, they may not scale on modern multicore CPUs. In this paper, we propose using speculation to unleash parallelism when it is uncertain whether some tasks will modify data, and we formalize a new methodology to enable speculative execution in a graph of tasks. This description is partially implemented in our new C++ RS, called SPETABARU, which is capable of executing tasks in advance when others are not certain to modify the data. We study the behavior of our approach on Monte Carlo and replica-exchange Monte Carlo simulations.

8 citations
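As a rough illustration of the speculation idea (SPETABARU itself is a C++ runtime; the function below is a hypothetical sketch, not its API), a successor task can be started on a private copy of the data while the uncertain task executes, with re-execution as the fallback on mis-speculation. A real runtime would run the two calls concurrently on worker threads; they are sequential here for clarity.

```python
import copy, random

def run_with_speculation(uncertain_task, successor, data):
    # Toy sketch of speculative task execution (not the SPETABARU API).
    # `uncertain_task(data)` only sometimes writes to `data` and returns
    # True when it did. Rather than serialising `successor` behind it,
    # start `successor` speculatively on a private copy of the data.
    snapshot = copy.deepcopy(data)       # version for the speculative run
    speculative = successor(snapshot)    # would run concurrently in a real RS
    wrote = uncertain_task(data)
    if not wrote:
        return speculative               # speculation valid: latency hidden
    return successor(data)               # mis-speculation: re-execute

# Hypothetical example: a Monte Carlo-style move accepted 30% of the time.
def maybe_move(state):
    if random.random() < 0.3:
        state["x"] += 1
        return True
    return False

print(run_with_speculation(maybe_move, lambda s: s["x"] * 2, {"x": 1}))
```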

Journal Article
Jiaxi Hu, Bingzhe Li, Cong Ma, David J. Lilja, Steven J. Koester
TL;DR: A scalable SNG based on the spin-Hall effect (SHE), capable of generating multiple independent stochastic streams simultaneously, that takes advantage of the efficient charge-to-spin conversion of the spin-Hall material and the intrinsic stochasticity of nanomagnets.
Abstract: Stochastic computing (SC) is a promising technology for low-cost hardware designs, but it suffers from long latency. Although parallel processing can efficiently shorten the latency, it requires duplicated stochastic number generators (SNGs), which cause substantial hardware overhead. This paper proposes a scalable SNG based on the spin-Hall effect (SHE) that is capable of generating multiple independent stochastic streams simultaneously. The design takes advantage of the efficient charge-to-spin conversion of the spin-Hall material and the intrinsic stochasticity of nanomagnets. Compared to previous spintronic SNGs, the SHE-SNG reduces area by 1.6×–7.8× and power by 4.9×–13× while increasing the degree of parallelism from 1 to 16. Compared to CMOS-based SNGs, the proposed SNG achieves 24×–120× and 53× reductions in area and power, respectively. Finally, three benchmarks were implemented; the results indicate that SC implementations with the proposed SHE-SNG achieve a 1.2×–29× reduction in hardware resources compared to implementations with previous CMOS- and spintronic-based designs while scaling the degree of parallelism from 1 to 64.

8 citations
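For readers unfamiliar with stochastic computing, the toy sketch below shows why multiple independent streams matter: a unipolar value p is encoded as a bitstream whose fraction of 1s is p, multiplication reduces to a bitwise AND, and running several independent lanes in parallel divides the latency. A software RNG stands in for the hardware randomness source (an LFSR plus comparator in CMOS designs, or the intrinsic switching randomness of SHE-driven nanomagnets in this paper); the lane count mirrors the degree of parallelism.

```python
import numpy as np

rng = np.random.default_rng(0)

def sng(p, length, streams=1):
    # Toy stochastic number generator: `streams` independent bitstreams
    # whose expected fraction of 1s is p. In hardware the RNG would be an
    # LFSR plus comparator (CMOS) or, in this paper, the intrinsic
    # switching randomness of SHE-driven nanomagnets; `streams` plays the
    # role of the degree of parallelism.
    return rng.random((streams, length)) < p

a = sng(0.5, 1024, streams=16)
b = sng(0.25, 1024, streams=16)
prod = a & b                   # bitwise AND multiplies unipolar values
print(prod.mean(axis=1))       # each of the 16 lanes estimates 0.5 * 0.25
```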

Journal Article
TL;DR: The Data Synchronized Pipeline Architecture (DSPA) presented in this paper allows a high degree of parallelism in the pipeline, even when some resources behave unpredictably.

8 citations
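Only the TL;DR is available here, so the following is a generic, hypothetical sketch rather than the DSPA design: a pipeline whose stages synchronize on data availability through blocking queues, one way to keep the rest of a pipeline busy when a resource behaves unpredictably.

```python
import queue, threading

def stage(fn, inbox, outbox):
    # Each stage blocks on its input queue, so synchronisation follows
    # the data: a slow or stalled stage delays only its consumers, while
    # upstream stages keep producing into the buffer.
    while (item := inbox.get()) is not None:
        outbox.put(fn(item))
    outbox.put(None)                     # propagate end-of-stream

q1, q2, q3 = queue.Queue(), queue.Queue(), queue.Queue()
threading.Thread(target=stage, args=(lambda x: x + 1, q1, q2)).start()
threading.Thread(target=stage, args=(lambda x: x * 2, q2, q3)).start()
for i in range(5):
    q1.put(i)
q1.put(None)
while (r := q3.get()) is not None:
    print(r)                             # (0+1)*2, (1+1)*2, ...
```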

Proceedings Article
10 Oct 2011
TL;DR: Tarragon, which is based on dataflow, targets latency-tolerant scientific computations and achieves high performance, in many cases exceeding that of equivalent latency-tolerant, hard-coded MPI implementations.
Abstract: In current practice, scientific programmers and HPC users are required to develop code that exposes a high degree of parallelism, exhibits high locality, dynamically adapts to the available resources, and hides communication latency. Hiding communication latency is crucial to realize the potential of today's distributed memory machines with highly parallel processing modules, and technological trends indicate that communication latencies will continue to be an issue as the performance gap between computation and communication widens. However, under Bulk Synchronous Parallel models, the predominant paradigm in scientific computing, scheduling is embedded into the application code. All the phases of a computation are defined and laid out as a linear sequence of operations, limiting overlap and the program's ability to adapt to communication delays.

In this paper we present an alternative model, called Tarragon, to overcome the limitations of Bulk Synchronous Parallelism. Tarragon, which is based on dataflow, targets latency-tolerant scientific computations. Tarragon supports a task-dependency graph abstraction in which tasks, the basic unit of computation, are organized as a graph according to their data dependencies, i.e., task precedence. In addition to the task graph, Tarragon supports metadata abstractions, annotations to the task graph, to express locality information and scheduling policies to improve performance.

Tarragon's functionality and underlying programming methodology are demonstrated on three classes of computations used in scientific domains: structured grids, sparse linear algebra, and dynamic programming. In the application studies, Tarragon implementations achieve high performance, in many cases exceeding the performance of equivalent latency-tolerant, hard-coded MPI implementations.

8 citations
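The dataflow execution model is straightforward to illustrate. The sketch below is an illustration of the general model, not Tarragon's C++ API: a task is submitted the moment its last predecessor finishes, so independent tasks, such as an interior computation and a halo exchange, overlap automatically, which is exactly the latency hiding the paper targets.

```python
import threading
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

def run_task_graph(tasks, deps, workers=4):
    # tasks: name -> callable; deps: name -> prerequisite names.
    # A task becomes runnable the moment its last prerequisite finishes,
    # so independent tasks overlap automatically (a sketch of the
    # dataflow model only, not Tarragon's C++ API).
    pending = {t: len(deps.get(t, ())) for t in tasks}
    children = defaultdict(list)
    for t, ds in deps.items():
        for d in ds:
            children[d].append(t)
    lock, done = threading.Lock(), threading.Event()
    remaining = [len(tasks)]
    pool = ThreadPoolExecutor(workers)

    def run(t):
        tasks[t]()
        ready = []
        with lock:                       # release successors atomically
            remaining[0] -= 1
            if remaining[0] == 0:
                done.set()
            for c in children[t]:
                pending[c] -= 1
                if pending[c] == 0:
                    ready.append(c)
        for c in ready:
            pool.submit(run, c)

    for t, n in list(pending.items()):
        if n == 0:                       # submit the root tasks
            pool.submit(run, t)
    done.wait()
    pool.shutdown()

# Hypothetical usage: the interior computation overlaps the halo exchange.
run_task_graph(
    {"halo": lambda: print("exchange halo"),
     "interior": lambda: print("compute interior"),
     "boundary": lambda: print("compute boundary")},
    {"boundary": ["halo"]})
```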


Network Information

Related Topics (5)
Server: 79.5K papers, 1.4M citations (85% related)
Scheduling (computing): 78.6K papers, 1.3M citations (83% related)
Network packet: 159.7K papers, 2.2M citations (80% related)
Web service: 57.6K papers, 989K citations (80% related)
Quality of service: 77.1K papers, 996.6K citations (79% related)
Performance Metrics

Number of papers in the topic in previous years:

Year    Papers
2022    1
2021    47
2020    48
2019    52
2018    70
2017    75