Topic

Degree of parallelism

About: Degree of parallelism is a research topic. Over the lifetime, 1515 publications have been published within this topic receiving 25546 citations.


Papers
Proceedings Article
Jens Teubner, Rene Mueller
12 Jun 2011
TL;DR: This work presents handshake join, a way of describing and executing window-based stream joins that is highly amenable to parallelized execution. It also gives a new intuition of window semantics, which the authors believe could inspire other stream processing algorithms or ongoing standardization efforts for stream query languages.
Abstract: In spite of the omnipresence of parallel (multi-core) systems, the predominant strategy to evaluate window-based stream joins is still strictly sequential, mostly just straightforward along the definition of the operation semantics. In this work we present handshake join, a way of describing and executing window-based stream joins that is highly amenable to parallelized execution. Handshake join naturally leverages available hardware parallelism, which we demonstrate with an implementation on a modern multi-core system and on top of field-programmable gate arrays (FPGAs), an emerging technology that has shown distinctive advantages for high-throughput data processing. On the practical side, we provide a join implementation that substantially outperforms CellJoin (the fastest published result) and that will directly turn any degree of parallelism into higher throughput or larger supported window sizes. On the semantic side, our work gives a new intuition of window semantics, which we believe could inspire other stream processing algorithms or ongoing standardization efforts for stream query languages.

144 citations
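
To make the "strictly sequential" strategy mentioned in the abstract above concrete, here is a minimal sketch of a window-based stream join evaluated one tuple at a time. It is not handshake join and not the authors' code. The count-based window, the `("R" | "S", value)` event encoding, and the names `window_join`, `window_size`, and `predicate` are all assumptions made for illustration.

```python
from collections import deque


def window_join(events, window_size, predicate):
    """Sequential sliding-window join over two interleaved streams.

    events: iterable of (side, value) pairs, where side is "R" or "S"
            (a hypothetical encoding chosen for this sketch).
    window_size: number of most recent tuples kept per stream (count-based).
    predicate: function (r, s) -> bool deciding whether two tuples join.
    """
    windows = {
        "R": deque(maxlen=window_size),
        "S": deque(maxlen=window_size),
    }
    results = []
    for side, value in events:
        other = "S" if side == "R" else "R"
        # Probe the opposite stream's current window with the new tuple.
        for candidate in windows[other]:
            r, s = (value, candidate) if side == "R" else (candidate, value)
            if predicate(r, s):
                results.append((r, s))
        # Add the tuple to its own window; a full deque drops its oldest entry.
        windows[side].append(value)
    return results


if __name__ == "__main__":
    stream = [("R", 1), ("S", 1), ("S", 2), ("R", 2), ("R", 1)]
    print(window_join(stream, window_size=2, predicate=lambda r, s: r == s))
```

Every arriving tuple scans the whole opposite window in a single thread. That per-tuple scan is the work a parallel scheme would have to distribute across cores.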

Proceedings Article
17 Nov 1996
TL;DR: This paper presents parallel algorithms for data mining of association rules, and studies the degree of parallelism, synchronization, and data locality issues on the SGI Power Challenge shared-memory multi-processor.
Abstract: Data mining is an emerging research area, whose goal is to extract significant patterns or interesting rules from large databases. High-level inference from large volumes of routine business data can provide valuable information to businesses, such as customer buying patterns, shelving criterion in supermarkets and stock trends. Many algorithms have been proposed for data mining of association rules. However, research so far has mainly focused on sequential algorithms. In this paper we present parallel algorithms for data mining of association rules, and study the degree of parallelism, synchronization, and data locality issues on the SGI Power Challenge shared-memory multi-processor. We further present a set of optimizations for the sequential and parallel algorithms. Experiments show that a significant improvement of performance is achieved using our proposed optimizations. We also achieved good speed-up for the parallel algorithm, but we observe a need for parallel I/O techniques for further performance gains.

143 citations
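
The support-counting step at the core of association rule mining can be split across workers. The sketch below illustrates that general idea for the abstract above and is not the paper's algorithm. The functions `count_pairs` and `frequent_pairs`, the round-robin partitioning, and the restriction to item pairs are assumptions made for this sketch.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def count_pairs(transactions):
    """Count how many transactions in one partition contain each item pair."""
    counts = Counter()
    for transaction in transactions:
        # Sort and deduplicate so each pair has one canonical form.
        for pair in combinations(sorted(set(transaction)), 2):
            counts[pair] += 1
    return counts


def frequent_pairs(transactions, min_support, workers=2):
    """Return the item pairs whose support count is at least min_support."""
    # Round-robin partitioning: worker i gets transactions i, i + workers, ...
    chunks = [transactions[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_counts = list(pool.map(count_pairs, chunks))
    # Merge the per-partition counts into global support counts.
    total = Counter()
    for partial in partial_counts:
        total.update(partial)
    return {pair: n for pair, n in total.items() if n >= min_support}


if __name__ == "__main__":
    baskets = [
        ["milk", "bread"],
        ["milk", "bread", "eggs"],
        ["bread", "eggs"],
        ["milk", "bread"],
    ]
    print(frequent_pairs(baskets, min_support=2))
```

Each worker counts its own partition independently, and the only shared step is the final merge. Threads are used here only to keep the sketch short. In CPython, CPU-bound counting would need processes (for example `ProcessPoolExecutor`) to run in parallel.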

Proceedings Article
01 May 1993
TL;DR: The goal is to quantify the floating point, memory, I/O and communication requirements of highly parallel scientific applications that perform explicit communication and develop analytical models for the effects of changing the problem size and the degree of parallelism.
Abstract: This paper studies the behavior of scientific applications running on distributed memory parallel computers. Our goal is to quantify the floating point, memory, I/O and communication requirements of highly parallel scientific applications that perform explicit communication. In addition to quantifying these requirements for fixed problem sizes and numbers of processors, we develop analytical models for the effects of changing the problem size and the degree of parallelism for several of the applications. We use the results to evaluate the trade-offs in the design of multicomputer architectures.

141 citations

Journal Article
TL;DR: This work compares two real-time architectures developed using FPGA and GPU devices for the computation of phase-based optical flow, stereo, and local image features (energy, orientation, and phase) and provides suggestions for selecting the most suitable technology.
Abstract: Low-level computer vision algorithms have extreme computational requirements. In this work, we compare two real-time architectures developed using FPGA and GPU devices for the computation of phase-based optical flow, stereo, and local image features (energy, orientation, and phase). The presented approach requires a massive degree of parallelism to achieve real-time performance and allows us to compare FPGA and GPU design strategies and trade-offs in a much more complex scenario than previous contributions. Based on this analysis, we provide suggestions to real-time system designers for selecting the most suitable technology, and for optimizing system development on this platform, for a number of diverse applications.

138 citations

Journal Article
TL;DR: This paper addresses the problem of efficiently computing the motor torques required to drive a manipulator arm in free motion, given the desired trajectory, that is, the inverse dynamics problem, and presents two "mathematically exact" formulations especially suited to high-speed, highly parallel implementations using VLSI devices.
Abstract: This paper addresses the problem of efficiently computing the motor torques required to drive a manipulator arm in free motion, given the desired trajectory, that is, the inverse dynamics problem. It analyzes the high degree of parallelism inherent in the computations and presents two "mathematically exact" formulations especially suited to high-speed, highly parallel implementations using VLSI devices. The first method presented is a parallel version of the recent linear Newton-Euler recursive algorithm. The time cost is linear in the number of joints, but the real-time coefficients are reduced by almost two orders of magnitude. The second formulation reports a new parallel algorithm that shows that it is possible to improve on the linear time dependency. The real time required to perform the calculations increases only as the log2 of the number of joints. Either formulation is susceptible to a systolic pipelined architecture in which complete sets of joint torques emerge at successive intervals of f...

136 citations
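
The abstract above states that the parallel running time can grow as log2 of the number of joints, but it does not spell out the algorithm. As a generic illustration of how a chain-like recurrence can finish in a logarithmic number of parallel steps, here is a sketch of a recursive-doubling prefix sum. It is an assumed stand-in, not the paper's formulation, and `recursive_doubling_scan` is a hypothetical name.

```python
def recursive_doubling_scan(values):
    """Inclusive prefix sum computed by recursive doubling.

    Returns (prefix_sums, steps). All element updates within one step are
    independent of each other, so with one processor per element each step
    would take constant time and the whole scan ceil(log2(n)) steps.
    This sketch runs the steps sequentially and only counts them.
    """
    result = list(values)
    n = len(result)
    stride = 1
    steps = 0
    while stride < n:
        # Build the next array from the previous one, so every update in
        # this step reads only values produced by the previous step.
        result = [
            result[i] + result[i - stride] if i >= stride else result[i]
            for i in range(n)
        ]
        stride *= 2
        steps += 1
    return result, steps


if __name__ == "__main__":
    sums, steps = recursive_doubling_scan([1, 2, 3, 4, 5, 6, 7, 8])
    print(sums, steps)
```

A sequential loop over the same eight values needs seven dependent additions, while the doubling scheme needs three steps. The trade-off is that it performs more additions in total.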


Network Information
Related Topics (5)
Server: 79.5K papers, 1.4M citations, 85% related
Scheduling (computing): 78.6K papers, 1.3M citations, 83% related
Network packet: 159.7K papers, 2.2M citations, 80% related
Web service: 57.6K papers, 989K citations, 80% related
Quality of service: 77.1K papers, 996.6K citations, 79% related
Performance Metrics
No. of papers in the topic in previous years
Year    Papers
2022    1
2021    47
2020    48
2019    52
2018    70
2017    75