scispace - formally typeset
Topic

Degree of parallelism

About: Degree of parallelism is a research topic. Over the lifetime, 1515 publications have been published within this topic receiving 25546 citations.


Papers
Proceedings ArticleDOI
05 Feb 2014
TL;DR: The results show that adaptivity is a strictly necessary requirement for reducing energy consumption in STM systems: without it, no acceptable level of energy efficiency can be reached.
Abstract: Energy efficiency is becoming a pressing issue, especially in large data centers, where it entails at the same time a non-negligible management cost, an increased hardware fault probability, and a significant environmental footprint. In this paper, we study how Software Transactional Memories (STM) can provide benefits for both power saving and the applications' overall execution performance. This is related to the fact that encapsulating shared-data accesses within transactions gives the STM middleware the freedom to both ensure consistency and reduce the actual data contention, the latter having been shown to affect the overall power needed to complete the application's execution. We have selected a set of self-adaptive extensions to existing STM middlewares (namely, TinySTM and R-STM) to show how self-adapting computation can better capture the actual degree of parallelism and/or logical contention on shared data, further enhancing the intrinsic benefits provided by STM. Of course, this benefit comes at a cost, namely the execution time required by the proposed approaches to tune the execution parameters for reducing power consumption and enhancing execution performance. Nevertheless, the results show that adaptivity is a strictly necessary requirement for reducing energy consumption in STM systems: without it, no acceptable level of energy efficiency can be reached.

10 citations
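The core idea of self-adapting the degree of parallelism, as in the STM work above, can be illustrated with a minimal hill-climbing sketch. This is not the paper's actual tuning algorithm; `tune_parallelism` and `measure_throughput` are hypothetical names, and the stopping rule (back off as soon as throughput stops improving, i.e. when contention starts to dominate) is a deliberate simplification.

```python
def tune_parallelism(measure_throughput, max_threads, step=1):
    """Hill-climb the number of active threads toward the highest
    measured throughput; a simplified stand-in for self-adaptive
    STM concurrency tuning."""
    best_n, best_t = 1, measure_throughput(1)
    n = 1 + step
    while n <= max_threads:
        t = measure_throughput(n)
        if t <= best_t:
            # Throughput stopped improving: contention (and wasted
            # aborts/energy) now outweighs the extra parallelism.
            break
        best_n, best_t = n, t
        n += step
    return best_n
```

In a real STM runtime, `measure_throughput` would be replaced by online measurements (e.g. commits per joule), and the search would be re-run periodically as the workload's contention profile changes.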

Dissertation
23 Feb 1998
TL;DR: A new polynomial-time algorithm is described, outperforming other current methods in terms of both complexity and application domain, and a general framework so as to handle any kind of dependences, by possibly producing approximate dependences is presented.
Abstract: Array dataflow dependence analysis is paramount for automatic parallelization. The description of dependences at the operation and array element level has been shown to improve significantly the output of many code optimizations. But this kind of analysis has two main issues: its high cost and its scope, which is limited to a small number of programs. We first describe a new polynomial-time algorithm, outperforming other current methods in terms of both complexity and application domain. Then, in the continuity of the work done by J.-F. Collard, we present a general framework to handle any kind of dependences, possibly by producing approximate dependences. The model of programs is extended to any reducible control graph and any kind of references to array elements. An original method, called iterative analysis, finds relations between non-affine constraints so as to improve the accuracy of the method. Besides, we provide a criterion ensuring that the approximation obtained is the best with respect to the information gathered on non-affine constraints by other analyses. Finally, several traditional applications of dataflow analyses are adapted to our method in order to take advantage of its results, and we detail more specifically an array expansion that is a trade-off between run-time overhead, memory requirement and degree of parallelism.

10 citations
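As a flavor of the kind of question array dependence analysis answers, here is the classic GCD test for affine subscripts. It is far simpler than the dissertation's exact polynomial-time analysis (the function name `may_depend` is ours): it only decides whether a write to `A[a*i + b]` and a read of `A[c*j + d]` can ever touch the same element, answering conservatively.

```python
from math import gcd

def may_depend(a, b, c, d, n):
    """GCD test: can A[a*i + b] (write) and A[c*j + d] (read)
    reference the same element for some iterations 0 <= i, j < n?
    A simplified stand-in for exact array dataflow analysis."""
    # A common element requires an integer solution of a*i - c*j = d - b.
    # Such a solution exists only if gcd(a, c) divides (d - b).
    if (d - b) % gcd(a, c) != 0:
        return False   # no integer solution: loop iterations are independent
    return True        # conservatively assume a dependence may exist
```

If `may_depend` returns False, the loop can be parallelized outright; if True, a more precise (and more expensive) analysis, like the one described above, is needed to decide.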

Book ChapterDOI
26 Aug 2003
TL;DR: RoCL is a communication library that aims to exploit the low-level communication facilities of today’s cluster networking hardware and to merge, via the resource oriented paradigm, those facilities and the high-level degree of parallelism achieved on SMP systems through multi-threading.
Abstract: RoCL is a communication library that aims to exploit the low-level communication facilities of today’s cluster networking hardware and to merge, via the resource oriented paradigm, those facilities and the high-level degree of parallelism achieved on SMP systems through multi-threading.

10 citations

01 Jan 1989
TL;DR: A novel partitioning strategy is outlined for maximizing the degree of parallelism in structural analysis and design that was implemented on the CRAY X-MP/4 and the Alliant FX/8 computers.
Abstract: A review is given of the recent advances in computer technology that are likely to impact structural analysis and design. The computational needs for future structures technology are described. The characteristics of new and projected computing systems are summarized. Advances in programming environments, numerical algorithms, and computational strategies for new computing systems are reviewed, and a novel partitioning strategy is outlined for maximizing the degree of parallelism. The strategy is designed for computers with a shared memory and a small number of powerful processors (or a small number of clusters of medium-range processors). It is based on approximating the response of the structure by a combination of symmetric and antisymmetric response vectors, each obtained using a fraction of the degrees of freedom of the original finite element model. The strategy was implemented on the CRAY X-MP/4 and the Alliant FX/8 computers. For nonlinear dynamic problems on the CRAY X-MP with four CPUs, it resulted in an order of magnitude reduction in total analysis time, compared with the direct analysis on a single-CPU CRAY X-MP machine.

10 citations
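The symmetric/antisymmetric decomposition at the heart of the partitioning strategy above can be shown in a few lines of NumPy. This is an illustrative sketch, not the paper's implementation: the reflection is modeled as a simple index permutation, and `sym_antisym_split` is a hypothetical name.

```python
import numpy as np

def sym_antisym_split(u, perm):
    """Split a response vector u into symmetric and antisymmetric
    parts with respect to a reflection, given as an index permutation.
    Each part can then be solved on a separate processor using a
    fraction of the degrees of freedom."""
    pu = u[perm]                 # reflected copy of the response
    u_sym = 0.5 * (u + pu)       # invariant under the reflection
    u_anti = 0.5 * (u - pu)      # negated by the reflection
    return u_sym, u_anti
```

Since `u = u_sym + u_anti` exactly, the two reduced problems can be assigned to different processors and their solutions summed, which is the source of the parallelism the paper exploits.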

Proceedings ArticleDOI
01 Dec 2013
TL;DR: A novel framework is presented for implementing portable and scalable data-intensive applications on reconfigurable hardware featuring Field-Programmable Gate Arrays and memory, together with a new method to automatically select a task's optimal degree of parallelism on an FPGA for a given hardware platform.
Abstract: This paper presents a novel framework for implementing portable and scalable data-intensive applications on reconfigurable hardware. Instead of using expensive “reconfigurable supercomputers”, we focus our work on standard PCs and PCI-Express extension cards featuring Field-Programmable Gate Arrays (FPGAs) and memory. In our framework, we exploit task-level parallelism by manually partitioning applications into several parallel tasks using a communication API for data streams. This also allows pure software implementations on PCs without FPGA cards. If an FPGA accelerator is present, the same API calls transfer data between the PC's CPU and the FPGA. The tasks implemented in hardware can then exploit instruction-level and pipelining parallelism as well. Furthermore, the framework consists of hardware implementation rules which enable portable and scalable designs. Device-specific hardware wrappers hide the FPGA's and board's idiosyncrasies from the application developer. We also present a new method to automatically select a task's optimal degree of parallelism on an FPGA for a given hardware platform, i.e., to generate a hardware design which optimally uses the available communication bandwidth between the PC and the FPGA. Experimental results show the feasibility of our approach.

10 citations
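The bandwidth-driven selection of the degree of parallelism described above can be sketched with a toy model. This is our simplification, not the paper's method: `optimal_dop` and its parameters are hypothetical names, and real selection would also account for FPGA area and clock constraints.

```python
import math

def optimal_dop(link_bw, unit_rate, max_units):
    """Pick the number of replicated processing units so that their
    aggregate data rate just saturates the host<->FPGA link.
    link_bw and unit_rate share any common unit (e.g. MB/s).
    A simplified model: replicating beyond the bandwidth bound only
    wastes area, since units would starve for data."""
    bandwidth_bound = max(1, math.floor(link_bw / unit_rate))
    return min(bandwidth_bound, max_units)
```

For example, with a 1000 MB/s PCIe link and units each consuming 300 MB/s, a fourth unit could never be kept busy, so three replicas is the useful maximum.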


Network Information
Related Topics (5)

Server: 79.5K papers, 1.4M citations (85% related)
Scheduling (computing): 78.6K papers, 1.3M citations (83% related)
Network packet: 159.7K papers, 2.2M citations (80% related)
Web service: 57.6K papers, 989K citations (80% related)
Quality of service: 77.1K papers, 996.6K citations (79% related)
Performance Metrics
No. of papers in the topic in previous years

Year    Papers
2022    1
2021    47
2020    48
2019    52
2018    70
2017    75