scispace - formally typeset
Topic

Degree of parallelism

About: Degree of parallelism is a research topic. Over its lifetime, 1,515 publications on this topic have been published, receiving 25,546 citations.


Papers
Journal ArticleDOI
TL;DR: A new SISO-decoder architecture is proposed that leads to significant throughput gains and better hardware efficiency compared to existing architectures for the full range of code rates.
Abstract: Turbo decoders for modern wireless communication systems have to support high throughput over a wide range of code rates. In order to support the peak throughputs specified by modern standards, parallel turbo decoding has become a necessity, rendering the corresponding VLSI implementation a highly challenging task. In this paper, we explore the implementation trade-offs of parallel turbo decoders based on sliding-window soft-input soft-output (SISO) maximum a posteriori (MAP) component decoders. We first introduce a new approach that allows for a systematic throughput comparison between different SISO-decoder architectures, taking their individual trade-offs in terms of window length, error-rate performance, and throughput into account. A corresponding analysis of existing architectures clearly shows that the latency of the sliding-window SISO decoders causes diminishing throughput gains with increasing degree of parallelism. In order to alleviate this parallel turbo-decoder predicament, we propose a new SISO-decoder architecture that leads to significant throughput gains and better hardware efficiency compared to existing architectures for the full range of code rates.
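The diminishing throughput gains described above can be illustrated with a toy timing model (an illustrative sketch, not the paper's analysis): each of P parallel sub-decoders processes block_len / P trellis steps plus a fixed window-acquisition latency, so throughput grows sublinearly in P.

```python
# Toy model (illustrative, not from the paper): with P parallel
# sliding-window SISO sub-decoders, each handles block_len / P trellis
# steps plus a fixed window-acquisition latency per block, so the
# speedup over a single decoder saturates as parallelism grows.
def relative_throughput(block_len, parallelism, window_latency):
    """Bits decoded per unit time for one code block."""
    time_per_block = block_len / parallelism + window_latency
    return block_len / time_per_block
```

For example, with block_len = 6144 and window_latency = 64, going from 1 to 16 sub-decoders yields roughly a 14x speedup rather than 16x, and the gap widens as parallelism increases further.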

39 citations

Journal ArticleDOI
TL;DR: This work proposes a lightweight asynchronous processing framework called Frog with a preprocessing/hybrid coloring model based on the Pareto principle applied to coloring algorithms, and finds that a majority of vertices are colored with only a few colors, so they can be read and updated with a very high degree of parallelism without violating sequential consistency.
Abstract: GPUs have increasingly been used to accelerate graph processing for complex computational problems in graph theory. Many parallel graph algorithms adopt an asynchronous computing model to speed up iterative convergence. Unfortunately, consistent asynchronous computing requires locking or atomic operations, which incur significant overheads when implemented on GPUs. Graph coloring is therefore used to separate vertices with potential update conflicts, guaranteeing the consistency and correctness of parallel processing. Common coloring algorithms, however, may suffer from low parallelism because of the large number of colors generally required to process a large-scale graph with billions of vertices. We propose a lightweight asynchronous processing framework called Frog with a preprocessing/hybrid coloring model. The fundamental idea is based on the Pareto principle (or 80-20 rule), which we observed across many real-world graph-coloring cases: a majority of vertices (about 80 percent) are colored with only a few colors, so they can be read and updated with a very high degree of parallelism without violating sequential consistency. Accordingly, our solution separates the processing of vertices based on the distribution of colors. In this work, we mainly answer three questions: (1) how to partition the vertices of a sparse graph with maximized parallelism, (2) how to process large-scale graphs that cannot fit into GPU memory, and (3) how to reduce the overhead of PCIe data transfers while processing each partition. We conduct experiments on real-world data (Amazon, DBLP, YouTube, RoadNet-CA, WikiTalk, and Twitter) to evaluate our approach and compare it with well-known non-preprocessed (Totem, Medusa, MapGraph, and Gunrock) and preprocessed (CuSha) approaches, testing four classical algorithms (BFS, PageRank, SSSP, and CC).
On all tested applications and datasets, Frog significantly outperforms the existing GPU-based graph processing systems, with two exceptions: MapGraph is faster than Frog when running BFS on RoadNet-CA, and the comparison with Gunrock is mixed. Frog outperforms Gunrock by more than 1.04x on PageRank and SSSP, while its advantage is less clear on BFS and CC for some datasets, especially RoadNet-CA.
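The color-class idea in the abstract can be sketched as follows (a minimal CPU-side illustration, not Frog's GPU implementation; all names are hypothetical): vertices are greedily colored so that no two adjacent vertices share a color, after which each color class can be updated without locks, since its vertices share no edges.

```python
# Minimal sketch of coloring-based conflict-free updates (illustrative,
# not Frog's actual implementation). Vertices in the same color class
# are pairwise non-adjacent, so updating them concurrently cannot race.
from collections import defaultdict

def greedy_color(adj):
    """adj: dict mapping vertex -> iterable of neighbors.
    Returns a dict vertex -> color with no adjacent pair sharing a color."""
    color = {}
    for v in adj:
        used = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color

def process_by_color(adj, values, update):
    """Apply `update(v, neighbors, values)` to every vertex, one color
    class at a time. The inner loop over a class is safe to parallelize
    because its vertices have no edges between them."""
    classes = defaultdict(list)
    for v, c in greedy_color(adj).items():
        classes[c].append(v)
    for c in sorted(classes):
        for v in classes[c]:  # conceptually a parallel-for
            values[v] = update(v, adj[v], values)
    return values
```

A real GPU framework would additionally batch the few large color classes for massively parallel kernels and fall back to a serialized or hybrid path for the long tail of small classes, which is the 80-20 observation the abstract describes.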

38 citations

Proceedings ArticleDOI
23 Sep 1994
TL;DR: Two methods for synthesis of VHDL specifications containing concurrent processes are presented to preserve simulation/synthesis correspondence during high-level synthesis and to produce hardware that operates with a high degree of parallelism.
Abstract: This paper presents two methods for synthesis of VHDL specifications containing concurrent processes. Our main objective is to preserve simulation/synthesis correspondence during high-level synthesis and to produce hardware that operates with a high degree of parallelism. The first method supports an unrestricted use of signals and wait statements and synthesizes synchronous hardware with global control of process synchronization for signal update. The second method allows hardware synthesis without the strict synchronization imposed by the VHDL simulation cycle. Experimental results have shown that the proposed methods are efficient for a wide spectrum of digital systems.

38 citations

Journal ArticleDOI
TL;DR: An analytic modeling technique based on Markovian Agents and Mean Field Analysis is proposed that effectively describes different concurrent Big Data applications on the same multi-site cloud infrastructure, accounting for mutual interactions, to support a careful evaluation of real costs, risks, and benefits.
Abstract: Big Data applications are characterized by a non-negligible number of complex parallel transactions on a huge amount of data that continuously varies, generally increasing over time. Because of the amount of resources needed, the ideal runtime scenario for these applications is based on complex cloud computing and storage infrastructures, providing a scalable degree of parallelism together with isolation between different applications and resource abstraction. However, this additional degree of abstraction also introduces significant complexity in performance modeling and decision making. The potential concurrency of many applications on the same cloud infrastructure has to be evaluated and, simultaneously, the scalability of applications over time has to be studied through proper modeling practices, in order to predict the system behavior as usage patterns evolve and the load increases. For this purpose, in this paper, we propose an analytic modeling technique based on Markovian Agents and Mean Field Analysis that effectively describes different concurrent Big Data applications on the same multi-site cloud infrastructure, accounting for their mutual interactions. The technique supports a careful evaluation of real costs, risks, and benefits for correctly dimensioning and allocating resources and verifying existing service level agreements. Copyright © 2014 John Wiley & Sons, Ltd.
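As a minimal illustration of the Mean Field idea (a toy model, not the paper's technique): in a large pool of nodes, the fraction x(t) that is busy can be approximated by a single deterministic ODE instead of tracking every node's state individually.

```python
# Toy mean-field model (illustrative, not from the paper): fraction
# x(t) of busy nodes in a large pool; idle nodes become busy at rate
# lam, busy nodes finish at rate mu. The mean-field ODE is
#   dx/dt = lam * (1 - x) - mu * x
def mean_field_trajectory(lam, mu, x0=0.0, dt=0.01, steps=2000):
    """Forward-Euler integration of the mean-field ODE."""
    x = x0
    for _ in range(steps):
        x += dt * (lam * (1 - x) - mu * x)
    return x
```

The trajectory converges to the fixed point lam / (lam + mu) regardless of the pool size, which is what makes mean-field approximations tractable for large infrastructures where per-node state spaces would explode.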

38 citations

Journal ArticleDOI
A. M. Glass1
09 Nov 1984-Science
TL;DR: The goal for optical information processing is to use the unique characteristics of light which are not readily achieved with electronic devices, namely, ultrahigh speed (picoseconds), a high degree of parallelism (image processing), and conductor-free interconnection.
Abstract: The goal for optical information processing is to use the unique characteristics of light which are not readily achieved with electronic devices, namely, ultrahigh speed (picoseconds), a high degree of parallelism (image processing), and conductor-free interconnection. The requirements of the nonlinear materials to perform such functions, using all-optical interactions, are discussed and the limitations of the nonlinear mechanisms are outlined.

38 citations


Network Information
Related Topics (5)
Server
79.5K papers, 1.4M citations
85% related
Scheduling (computing)
78.6K papers, 1.3M citations
83% related
Network packet
159.7K papers, 2.2M citations
80% related
Web service
57.6K papers, 989K citations
80% related
Quality of service
77.1K papers, 996.6K citations
79% related
Performance
Metrics
No. of papers in the topic in previous years
Year	Papers
2022	1
2021	47
2020	48
2019	52
2018	70
2017	75