scispace - formally typeset

Degree of parallelism

About: Degree of parallelism is a research topic. Over the lifetime, 1515 publications have been published within this topic receiving 25546 citations.


Papers
Posted Content
TL;DR: A modularized graph processing framework that addresses the whole execution procedure across widely varying degrees of parallelism, with a novel conversion dispatcher that switches processing modules at the corresponding exchange points.
Abstract: Highly parallel frameworks have proved very suitable for graph processing. There is a variety of work optimizing implementations on FPGAs, which are pipeline-parallel devices. The key to exploiting the parallel performance of FPGAs is to process graph data in a pipeline model and to use on-chip memory to realize the necessary locality. This paper proposes a modularized graph processing framework that addresses the whole execution procedure across widely varying degrees of parallelism. The framework makes three contributions. First, the combination of vertex-centric and edge-centric processing can be adjusted during execution to accommodate both top-down and bottom-up algorithms. Second, to suit pipeline-parallel accelerators with finite on-chip memory, a novel edge-block (a block consisting of the edges of a group of vertices) optimizes how on-chip memory is used: edges are grouped into blocks and streamed block by block, realizing the streaming pattern needed for pipeline-parallel processing. Third, based on an analysis of the block structure of natural graphs and the execution characteristics of graph processing, we design a novel conversion dispatcher that switches processing modules at the corresponding exchange points.
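The switch between top-down (vertex-centric, push) and bottom-up (edge-centric, pull) processing that the dispatcher performs can be illustrated with a direction-switching BFS sketch; the frontier-size threshold `alpha` here is an illustrative assumption, not the paper's exact exchange-point heuristic.

```python
def hybrid_bfs(adj, source, alpha=0.25):
    """Direction-switching BFS: top-down (push) while the frontier is
    small, bottom-up (pull) once it grows past alpha * |V|.
    `adj` maps each vertex to its neighbor list (undirected graph)."""
    n = len(adj)
    dist = {source: 0}
    frontier = {source}
    level = 0
    while frontier:
        level += 1
        if len(frontier) < alpha * n:   # top-down: expand the frontier
            nxt = {v for u in frontier for v in adj[u] if v not in dist}
        else:                           # bottom-up: unvisited vertices pull
            nxt = {v for v in adj if v not in dist
                   and any(u in frontier for u in adj[v])}
        for v in nxt:
            dist[v] = level
        frontier = nxt
    return dist
```

The bottom-up pass scans unvisited vertices instead of the frontier, which is the cheaper direction once most vertices are already reached — the same asymmetry the framework's exchange points exploit.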

5 citations

Proceedings ArticleDOI
13 Jun 2010
TL;DR: The key difference of the proposed CGRA based solution compared to FPGA and GPU based solutions is a much better match of the architecture and algorithm for the core computational need as well as the system level architectural need.
Abstract: A Coarse Grain Reconfigurable Architecture (CGRA) tailored for accelerating bio-informatics algorithms is proposed. The key innovation is a lightweight bio-informatics processor that can be reconfigured to perform the different Add-Compare-Select operations of the popular sequencing algorithms. A programmable and scalable architectural platform instantiates an array of such processing elements, allows arbitrary partitioning and scheduling schemes, and is capable of solving complete sequencing algorithms, including the sequential phases, while handling arbitrarily large sequences. The key difference of the proposed CGRA-based solution compared to FPGA- and GPU-based solutions is a much better match between architecture and algorithm, both for the core computational need and for the system-level architectural need. This claim is quantified for three popular sequencing algorithms: Needleman-Wunsch, Smith-Waterman, and HMMER. For the same degree of parallelism, we provide 5X and 15X speedups compared to FPGA and GPU implementations, respectively. For the same silicon area, the advantage grows by another factor of 10X.
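The Add-Compare-Select operation these processing elements implement is the dynamic-programming cell update at the heart of the named algorithms. A minimal Smith-Waterman sketch makes it concrete (the scoring parameters are illustrative assumptions):

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Local alignment score via the Smith-Waterman recurrence.
    Each cell update is one Add-Compare-Select: add a score to each
    predecessor cell, compare the candidates, select the maximum."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,
                          H[i - 1][j - 1] + sub,  # diagonal: (mis)match
                          H[i - 1][j] + gap,      # up: gap in b
                          H[i][j - 1] + gap)      # left: gap in a
            best = max(best, H[i][j])
    return best
```

Cells on the same anti-diagonal depend only on earlier anti-diagonals and are therefore independent — the parallelism that processor arrays like the proposed CGRA exploit.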

4 citations

01 Jan 1999
TL;DR: The work discussed is performed within a project aiming at developing strategies to automatically determine optimal data allocation strategies in order to simplify system administration in high performance environments.
Abstract: Data placement is a key factor for high-performance database systems. This is particularly true for parallel database systems, where data allocation must support both I/O parallelism and processing parallelism, within complex queries and between independent queries and transactions. Determining an effective data placement is a complex administration problem depending on many parameters, including system architecture, database and workload characteristics, hardware configuration, etc. Research and tool support has so far concentrated on data placement for base tables, especially for Shared Nothing (SN), e.g. [MD97]. On the other hand, to our knowledge, data placement issues for architectures where multiple DBMS instances share access to the same disks (Shared Disk, Shared Everything, specific hybrid architectures) have not yet been investigated in a systematic way. Furthermore, little work has been published on effective disk allocation of index structures and temporary data (e.g., intermediate query results). However, these allocation problems gain increasing importance, e.g. in order to effectively utilize parallel database systems for decision support / data warehousing environments. In the next section we discuss the index allocation problem in more detail and introduce a classification of various approaches that are already supported to some degree in commercial DBMSs. While SN offers only few options, the other architectures provide higher flexibility because index allocation can be independent from the base table allocation. For certain index-supported queries, this can allow order-of-magnitude savings in I/O and communication cost. We then turn to the disk allocation of intermediate query results, for which the allocation parameters can be chosen dynamically at query run time. For the case of parallel hash joins, we outline how to determine an optimal approach supporting a high degree of parallelism. The work discussed is performed within a project aiming at developing strategies to automatically determine optimal data allocation strategies in order to simplify system administration in high-performance environments.
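In a parallel hash join, the degree of parallelism is typically realized by hash-partitioning both inputs on the join key so that matching partitions can be joined independently. A minimal sketch of that pattern (the worker count and use of Python's built-in `hash` are illustrative assumptions):

```python
from concurrent.futures import ThreadPoolExecutor

def partition(rows, key, dop):
    """Hash-partition rows on the join key into `dop` buckets."""
    parts = [[] for _ in range(dop)]
    for row in rows:
        parts[hash(row[key]) % dop].append(row)
    return parts

def join_partition(build, probe, key):
    """Classic hash join of one partition pair: build a hash table
    on one input, probe it with the other."""
    table = {}
    for row in build:
        table.setdefault(row[key], []).append(row)
    return [(b, p) for p in probe for b in table.get(p[key], [])]

def parallel_hash_join(r, s, key, dop=4):
    """Join relations r and s on `key`. Because both sides use the
    same partitioning function, matching keys land in the same
    bucket, so the dop partition pairs are independent work units."""
    rp, sp = partition(r, key, dop), partition(s, key, dop)
    with ThreadPoolExecutor(max_workers=dop) as pool:
        results = pool.map(join_partition, rp, sp, [key] * dop)
    return [pair for part in results for pair in part]
```

Here `dop` is exactly the degree of parallelism chosen at query run time, as the abstract describes for intermediate results.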

4 citations

Journal ArticleDOI
TL;DR: An asymptotic model built on Amdahl's law, Eble's rule, and statistical yield equations to derive the optimum number of cores with respect to "performance-averaged yield"; the model can predict the impact of different manycore processor configurations and process technology parameters on performance-averaged yield for a given degree of parallelism.
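The shape of such a model can be illustrated by combining Amdahl's law with a simple Poisson yield term. The defect density, core area, and the multiplicative "speedup times yield" objective below are illustrative assumptions, not the paper's actual equations:

```python
import math

def amdahl_speedup(n, p):
    """Amdahl's law: speedup on n cores with parallel fraction p."""
    return 1.0 / ((1.0 - p) + p / n)

def die_yield(n, core_area_cm2, d0=0.5):
    """Poisson yield model: probability that a die with n cores of the
    given area is defect-free, with defect density d0 per cm^2."""
    return math.exp(-d0 * n * core_area_cm2)

def performance_averaged_yield(n, p, core_area_cm2, d0=0.5):
    """Toy objective: expected good-die performance. The optimum n
    balances rising Amdahl speedup against falling yield."""
    return amdahl_speedup(n, p) * die_yield(n, core_area_cm2, d0)

# Sweep core counts to find the optimum for a given parallel fraction.
best_n = max(range(1, 257),
             key=lambda n: performance_averaged_yield(n, p=0.95,
                                                      core_area_cm2=0.02))
```

With these toy numbers the objective peaks at an interior core count: more cores help until the yield penalty outweighs the shrinking marginal speedup, which is the trade-off the paper's model formalizes.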

4 citations

Patent
21 Mar 2016
TL;DR: In this paper, the authors propose adaptive power reduction for a solid-state storage device that dynamically controls power consumption using a power limit command from the host together with power consumption feedback.
Abstract: Embodiments are disclosed for adaptive power reduction for a solid-state storage device to dynamically control power consumption. Aspects of the embodiments include: receiving a power limit command from a host; receiving power consumption feedback; using the power limit command and the power consumption feedback to calculate a new degree of parallelism; and using the new degree of parallelism to control one or more of: (i) processor parallelism, including activation of different numbers of processors; (ii) memory parallelism, including memory pool length; and (iii) nonvolatile memory parallelism, including activation of different numbers of nonvolatile memory devices.
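The feedback loop described can be sketched as a simple proportional controller; the scaling rule and the clamping bounds below are illustrative assumptions, not the patent's claimed method:

```python
def new_degree_of_parallelism(power_limit_w, measured_power_w,
                              current_dop, max_dop=16):
    """Scale the degree of parallelism by how far measured power is
    from the host-supplied limit, clamped to [1, max_dop]."""
    if measured_power_w <= 0:
        return current_dop  # no feedback yet: keep current setting
    scaled = int(current_dop * power_limit_w / measured_power_w)
    return max(1, min(max_dop, scaled))

def apply_dop(dop, processors, nand_dies):
    """Activate a matching number of processors and NAND dies
    (memory-pool length would be adjusted the same way)."""
    return processors[:dop], nand_dies[:dop]
```

If the device draws 4 W against a 2 W limit, the controller halves the degree of parallelism, which in turn deactivates processors and NAND dies until measured power converges toward the limit.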

4 citations


Network Information
Related Topics (5)
Server
79.5K papers, 1.4M citations
85% related
Scheduling (computing)
78.6K papers, 1.3M citations
83% related
Network packet
159.7K papers, 2.2M citations
80% related
Web service
57.6K papers, 989K citations
80% related
Quality of service
77.1K papers, 996.6K citations
79% related
Performance Metrics
No. of papers in the topic in previous years
Year    Papers
2022    1
2021    47
2020    48
2019    52
2018    70
2017    75