Topic

Degree of parallelism

About: Degree of parallelism is a research topic. Over its lifetime, 1,515 publications have been published within this topic, receiving 25,546 citations.


Papers
Journal Article
TL;DR: Shows the existence of an optimum degree of parallelism (τopt) for which the best performance, in terms of efficiency and number of iterations, and effectiveness are obtained.

12 citations

Proceedings Article
07 Nov 2012
TL;DR: This work studies the performance of parallel local search for SAT with a large degree of parallelism, up to 256 cores, and compares various cooperation strategies.
Abstract: Parallel portfolio-based algorithms have become a standard methodology for both complete and incomplete SAT solvers. In this methodology, several algorithms explore the search space in parallel, either independently or cooperatively with some communication between the solvers. Unlike previous work, where parallel algorithms are limited to a few cores (usually up to 16), this work studies the performance of parallel local search for SAT with a large degree of parallelism, up to 256 cores, and compares various cooperation strategies. The best-performing strategy uses small groups of solvers (e.g. 4 or 8) that share information within the group and perform no inter-group communication.

12 citations

Proceedings Article
Lin Hu1, Lei Zou1, Yu Liu1
09 Jun 2021
TL;DR: This paper proposes a novel lightweight graph preprocessing method to boost many state-of-the-art GPU triangle counting algorithms without changing their implementations and data structures. It identifies common computing patterns in existing algorithms and abstracts two analytic models to measure exactly how workload imbalance and diversity in these computing patterns affect performance.
Abstract: Triangle counting is an important problem in graph mining, and its performance on GPUs has improved greatly in recent years. Instead of proposing a new GPU triangle counting algorithm, in this paper, we propose a novel lightweight graph preprocessing method to boost many state-of-the-art GPU triangle counting algorithms without changing their implementations and data structures. Specifically, we find common computing patterns in existing algorithms, and abstract two analytic models to measure exactly how workload imbalance and diversity in these computing patterns affect performance. Then, due to the NP-hardness of the model optimization, we propose approximate solutions by determining edge directions to balance workloads and reordering vertices to maximize the degree of parallelism within GPU blocks. Finally, extensive experiments confirm the significant performance improvement and high usability of our approach.

12 citations
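
Illustration (not from the paper): the abstract above mentions choosing edge directions to balance workloads. A minimal CPU-side sketch of the widely used degree-based orientation is given below, assuming integer vertex ids and an undirected edge list. The function name `count_triangles` and the ranking rule are assumptions made for this example; the paper's own analytic models and its approximate solutions are not reproduced here.

```python
def count_triangles(edges):
    """Count triangles in an undirected graph given as (u, v) pairs.

    Each edge is oriented from the lower-ranked endpoint to the
    higher-ranked one, where rank = (degree, vertex id). High-degree
    vertices therefore keep few outgoing edges, which evens out the
    per-vertex work. This is the standard orientation trick, used here
    only as an illustration.
    """
    # Build the undirected adjacency sets, ignoring self-loops.
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    def rank(x):
        return (len(adj[x]), x)

    # Keep only edges that point "upward" in rank.
    out = {u: {v for v in nbrs if rank(u) < rank(v)}
           for u, nbrs in adj.items()}

    # Every triangle is found exactly once, at its lowest-ranked vertex.
    return sum(len(out[u] & out[v]) for u in out for v in out[u])


if __name__ == "__main__":
    k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
    print(count_triangles(k4))
```

Because each set intersection is independent of the others, the per-edge work items in this sketch are the kind of units a GPU implementation would distribute across threads or blocks.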

Journal Article
TL;DR: An algorithm for 2-D convolution that explicitly takes into account the boundary conditions is presented; it allows a large image to be partitioned so that each partition may be processed by independent 2-D convolvers.
Abstract: An image is regarded as a 2-D array of pixels and is processed by a 2-D array architecture. The image can be acquired in the usual manner by a raster scan method which produces a 1-D array of pixels at real-time video rates. Two 2-D systolic arrays for a 2-D convolver are presented. They have an architecture which accepts this 1-D array of pixels and processes them in a 2-D array of simple processors. This high degree of parallelism is achieved through matrix-vector formulations of 2-D convolution. One array has a serial input, a serial output, and uses a minimum number of multipliers; the other array has parallel input, parallel output, and is suitable for high-speed processing using slow processing elements. Both arrays are modular with nearest-neighbor communications and are suitable for VLSI implementation. In addition, an algorithm for 2-D convolution that explicitly takes into account the boundary conditions is presented. This feature allows a large image to be partitioned so that each partition may be processed by independent 2-D convolvers. It is then possible to process only a specified section of the image or carry out high-speed parallel processing using as many 2-D convolvers as are available.

12 citations
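
Illustration (not from the paper): the partitioning idea in the abstract above can be sketched in software by splitting the image into horizontal bands that overlap by the kernel height minus one row, so that each band can be processed on its own. The function names, the band layout, and the "valid" boundary handling are assumptions for this example; the paper's systolic arrays and its exact boundary treatment are not modeled. The kernel is applied without flipping (correlation form), which does not affect the partitioning argument.

```python
def convolve_valid(image, kernel):
    """Apply the kernel at every position where it fits entirely
    inside the image (no padding, kernel not flipped)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def convolve_partitioned(image, kernel, bands):
    """Split the image into overlapping horizontal bands, convolve each
    band independently, and concatenate the output rows."""
    kh = len(kernel)
    out_h = len(image) - kh + 1
    step = -(-out_h // bands)  # ceiling division: output rows per band
    result = []
    for start in range(0, out_h, step):
        stop = min(start + step, out_h)
        # Each band carries kh - 1 extra rows so its last output row
        # sees the same pixels as in the unpartitioned computation.
        band = image[start:stop + kh - 1]
        result.extend(convolve_valid(band, kernel))
    return result


if __name__ == "__main__":
    image = [[(r * 7 + c * 3) % 11 for c in range(5)] for r in range(6)]
    kernel = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
    print(convolve_partitioned(image, kernel, 2) == convolve_valid(image, kernel))
```

The loop over bands has no dependencies between iterations, so each band could be handed to a separate worker; the sequential loop is kept here only for brevity.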

Journal Article
TL;DR: An improvement to a parallel implementation of T-Coffee, a widely used MSA package, that resolves the bottleneck of the progressive alignment stage on MSA and shows improvements in execution time of over 68% while maintaining the biological accuracy.
Abstract: Multiple Sequence Alignment (MSA) constitutes an extremely powerful tool for important biological applications such as phylogenetic analysis, identification of conserved motifs and domains, and structure prediction. In spite of the improvement in speed and accuracy introduced by MSA programs, the computational requirements for large-scale alignments require high-performance computing and parallel applications. In this paper we present an improvement to a parallel implementation of T-Coffee, a widely used MSA package. Our approximation resolves the bottleneck of the progressive alignment stage on MSA. This is achieved by increasing the degree of parallelism by balancing the guide tree that drives the progressive alignment process. The experimental results show improvements in execution time of over 68% while maintaining the biological accuracy.

12 citations
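
Illustration (not from the paper): the link between guide-tree shape and degree of parallelism can be seen with a small sketch. It assumes a guide tree written as nested 2-tuples with sequence names at the leaves, and it assumes that merges at the same depth can run concurrently. The helper names are hypothetical, and the sketch ignores sequence similarity, which a real balancing heuristic has to respect in order to preserve alignment accuracy.

```python
def height(tree):
    """Number of sequential merge steps on the longest root-to-leaf path."""
    if not isinstance(tree, tuple):
        return 0  # a leaf (single sequence) needs no merge
    left, right = tree
    return 1 + max(height(left), height(right))


def chain_tree(leaves):
    """Fully unbalanced tree: each merge waits for the previous one."""
    tree = leaves[0]
    for leaf in leaves[1:]:
        tree = (tree, leaf)
    return tree


def balanced_tree(leaves):
    """Balanced tree: sibling subtrees can be aligned independently."""
    if len(leaves) == 1:
        return leaves[0]
    mid = len(leaves) // 2
    return (balanced_tree(leaves[:mid]), balanced_tree(leaves[mid:]))


if __name__ == "__main__":
    seqs = ["s%d" % i for i in range(8)]
    merges = len(seqs) - 1  # a binary tree over n leaves has n - 1 merges
    for name, tree in (("chain", chain_tree(seqs)),
                       ("balanced", balanced_tree(seqs))):
        steps = height(tree)
        print(name, "sequential steps:", steps,
              "average parallelism:", round(merges / steps, 2))
```

Both trees perform the same seven merges for eight sequences, but the chain needs seven sequential steps while the balanced tree needs three, which is where the extra parallelism comes from.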


Network Information
Related Topics (5)
Server: 79.5K papers, 1.4M citations, 85% related
Scheduling (computing): 78.6K papers, 1.3M citations, 83% related
Network packet: 159.7K papers, 2.2M citations, 80% related
Web service: 57.6K papers, 989K citations, 80% related
Quality of service: 77.1K papers, 996.6K citations, 79% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2022    1
2021    47
2020    48
2019    52
2018    70
2017    75