Topic

Degree of parallelism

About: Degree of parallelism is a research topic. Over its lifetime, 1,515 publications have been published within this topic, receiving 25,546 citations.


Papers
Journal Article
TL;DR: This paper presents the design and implementation of an efficient reconfigurable parallel prefix computation hardware on field-programmable gate arrays (FPGAs), based on a pipelined dataflow algorithm with control logic added to reconfigure the system for an arbitrary degree of parallelism.
Abstract: This paper presents the design and implementation of an efficient reconfigurable parallel prefix computation hardware on field-programmable gate arrays (FPGAs). The design is based on a pipelined dataflow algorithm, and control logic is added to reconfigure the system for arbitrary parallelism degree. The system receives multiple input streams of elements in parallel and produces output streams in parallel. It has an advantage of controlling the degree of parallelism explicitly at run time. The time complexity of the design is O(d + (N - d)/d), where d and N are parallelism degree and stream size, respectively. When the stream size is sufficiently larger than the initial trigger time of the pipeline (d), the time complexity becomes O(N/d). Unlike the prefix computation circuits found in the literature, the design is scalable for different problem sizes including unknown sized data. The design is modular based on a finite state machine, and implemented and tested for target FPGA devices Xilinx Spartan2S XC2S300EFT256-6Q and XC2S600EFG676-6.

11 citations
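
As a rough illustration of the complexity bound quoted in the abstract above, the following Python sketch emulates a prefix sum that consumes d elements per step and compares the step count with the stated model d + (N - d)/d. It is a software analogy only, not the authors' FPGA design; the function names `chunked_prefix_sum` and `modeled_steps` are hypothetical, and addition is assumed as the prefix operator.

```python
from itertools import accumulate


def chunked_prefix_sum(stream, d):
    """Inclusive prefix sum that consumes d elements per simulated step.

    Assumption: the d lanes of each step would run concurrently in hardware;
    here they are emulated sequentially with itertools.accumulate.
    """
    out = []
    carry = 0   # running total of all previous chunks
    steps = 0   # number of d-wide steps taken
    for start in range(0, len(stream), d):
        chunk = stream[start:start + d]
        local = list(accumulate(chunk))        # prefix sums within the chunk
        out.extend(carry + x for x in local)   # offset by earlier chunks
        carry += local[-1]
        steps += 1
    return out, steps


def modeled_steps(n, d):
    """Step count suggested by the bound d + (N - d)/d from the abstract."""
    return d + max(n - d, 0) / d
```

For a stream of N = 1024 elements and d = 4, the emulation takes 256 chunk steps, while the model gives 4 + 1020/4 = 259; the difference is the pipeline start-up term d, which becomes negligible as N grows, consistent with the O(N/d) limit.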

Proceedings Article
T. Yamauchi, T. Nakata, N. Koike, A. Ishizuka, N. Nishiguchi
11 Nov 1991
TL;DR: The authors describe a novel parallel detailed router named PROTON (parallel router on a parallel machine) with various new features, which include a parallelized line search algorithm based on parallel breadth first search and extraction of a higher degree of parallelism by simultaneous routing of multiple nets using the result of the global router.
Abstract: The authors describe a novel parallel detailed router named PROTON (parallel router on a parallel machine) with various new features. These features include: a parallelized line search algorithm based on parallel breadth first search; extraction of a higher degree of parallelism by simultaneous routing of multiple nets using the result of the global router; a parallel router on a quasi-shared-memory based MIMD parallel machine; and a detailed router supporting multilayer channelless gate arrays with complex industrial design rules. PROTON is implemented on an MIMD parallel machine named Cenju, which consists of 64 microprocessors. In order to improve routing speed, PROTON incorporates two levels of parallelism, namely magnet parallelism and net level parallelism. A speedup of 43 times has been achieved using 64 processors for a medium-scale channelless gate array (1537×1790 grids, 12591 pin pairs).

11 citations
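
To put the PROTON figure above (a speedup of 43 on 64 processors) in context, the sketch below computes two standard derived metrics: parallel efficiency and the Karp-Flatt experimentally determined serial fraction. Neither value is reported in the abstract; they are simple arithmetic on the quoted numbers, and the function names are hypothetical.

```python
def parallel_efficiency(speedup, processors):
    """Fraction of ideal linear speedup actually achieved."""
    return speedup / processors


def karp_flatt(speedup, processors):
    """Karp-Flatt serial fraction: (1/S - 1/p) / (1 - 1/p), for p > 1."""
    if processors <= 1:
        raise ValueError("Karp-Flatt metric requires more than one processor")
    return (1 / speedup - 1 / processors) / (1 - 1 / processors)
```

With the quoted values, `parallel_efficiency(43, 64)` is 43/64, about 0.67, and `karp_flatt(43, 64)` is 1/129, about 0.0078, which would suggest that under 1% of the work behaves as serial overhead, assuming the speedup was measured against a single-processor run of the same program.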

01 Jan 2011
TL;DR: This work investigates parallel smoothers for matrix-based geometric multigrid in finite element solvers, an approach that offers high flexibility with respect to complex geometries and local singularities and adapts well to the exigences of modern computing platforms.
Abstract: Multigrid methods are efficient and fast solvers for problems typically modeled by partial differential equations of elliptic type. We use the approach of matrix-based geometric multigrid that has high flexibility with respect to complex geometries and local singularities. Furthermore, it adapts well to the exigences of modern computing platforms. In this work we investigate multi-colored Gauss-Seidel type smoothers, the power(q)-pattern enhanced multi-colored ILU(p,q) smoothers with fill-ins, and factorized sparse approximate inverse (FSAI) smoothers. These approaches provide efficient smoothers with a high degree of parallelism. We describe the configuration of our smoothers in the context of the portable lmpLAtoolbox and the HiFlow3 parallel finite element package. In our approach, a single source code can be used across diverse platforms including multicore CPUs and GPUs. Highly optimized implementations are hidden behind a unified user interface. Efficiency and scalability of our multigrid solvers are demonstrated by means of a comprehensive performance analysis on multicore CPUs and GPUs.

11 citations

Proceedings Article
Xiulin Li, Shijun Liu, Li Pan, Yuliang Shi, Xiangxu Meng
02 Jul 2018
TL;DR: A novel tandem queuing network with a parallel multi-station multi-server system as an analytical model for service clouds serving composite service application jobs containing parallelizable tasks is described.
Abstract: Performance analysis is important for service clouds serving composite service application jobs containing parallelizable tasks, because optimizing the degree of parallelism (DOP) and resource allocation schemes can improve performance noticeably. In this paper, we describe a novel tandem queuing network with a parallel multi-station multi-server system as an analytical model for service clouds serving composite service application jobs. We design a partition method (termed the 'pleasing partition') that supports an analytical model for the parallelizable service, which is the vital fraction of a composite service. From the proposed analytical model we then obtain a complete probability distribution of response time, waiting time, and other important performance metrics. Using this model, cloud operators can determine proper job configurations and resource allocation schemes to achieve a specific QoS (Quality of Service). Extensive simulations are conducted to validate that our analytical model has high accuracy in predicting performance metrics of composite service application jobs.

11 citations

Proceedings Article
13 Oct 2019
TL;DR: The highlights of ros-dmapf are its scalability and a high degree of parallelism, and its evaluation against some other MAPF solvers shows that the system performs well.
Abstract: Multi-Agent Path Finding (MAPF) problems are traditionally solved in a centralized manner. There are works focusing on completeness, optimality, performance, or a tradeoff between them. However, there are only a few works based on spatial distribution. In this paper, we introduce ros-dmapf, a distributed MAPF solver. It consists of multiple MAPF sub-solvers, which, besides solving their assigned sub-problems, interact with each other to solve a given MAPF problem. In the current implementation, the sub-solvers are answer set planning systems for multiple agents, and are created based on spatial distribution of the problem. Interactions between components of ros-dmapf are facilitated by the Robot Operating System (ROS). The highlights of ros-dmapf are its scalability and a high degree of parallelism. We empirically evaluate ros-dmapf using the move-only domain of the asprilo system and results suggest that ros-dmapf scales up well. For instance, ros-dmapf gives a solution of length around 600 for a MAPF problem with 2000 robots in randomly generated 100×100 obstacle-free maps (a problem beyond the capability of a single sub-solver) within 7 minutes on a consumer laptop. We also evaluate ros-dmapf against some other MAPF solvers and results show that the system performs well. We also discuss possible improvements for future work.

11 citations


Network Information
Related Topics (5)
Server: 79.5K papers, 1.4M citations, 85% related
Scheduling (computing): 78.6K papers, 1.3M citations, 83% related
Network packet: 159.7K papers, 2.2M citations, 80% related
Web service: 57.6K papers, 989K citations, 80% related
Quality of service: 77.1K papers, 996.6K citations, 79% related
Performance Metrics
No. of papers in the topic in previous years
Year    Papers
2022    1
2021    47
2020    48
2019    52
2018    70
2017    75