Topic

Degree of parallelism

About: Degree of parallelism is a research topic. Over the lifetime, 1,515 publications have been published within this topic receiving 25,546 citations.


Papers
Proceedings ArticleDOI
27 Mar 1997
TL;DR: A hybrid shape recognition system with an optical Hough transform preprocessor is presented. A very compact design is achieved by a microlens array processor; using incoherent light, the processor accepts direct optical input without any extra image converter being required.
Abstract: We present a hybrid shape recognition system with an optical Hough transform preprocessor. A very compact design is achieved by a microlens array processor. Using incoherent light, the processor accepts direct optical input without any extra image converter being required. The microlens array processor is constructed of a crossed assembly of two low-cost plastic lenticular arrays and a Hough transform weight mask. It is integrated in a compact objective barrel, which is attached directly to a CCD camera like a conventional camera lens. The system delivers one output signal for each of the 64 × 64 microlenses. The resolution of the microlenses and the weight mask results in an extremely high degree of parallelism. It corresponds to a connection of 4k inputs and outputs by 16M weights in parallel. The feature extraction tasks of lower computational complexity and the classification, which can be performed in real-time, are implemented as a neural network on a personal computer.
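The quoted connectivity figure is worth unpacking: connecting 4k inputs to 4k outputs through 16M weights is, in software terms, a dense matrix-vector product in which each of the 64 × 64 = 4096 outputs is a weighted sum over all 4096 input pixels. The numpy sketch below is a hypothetical digital analogue of such a Hough-transform weight mask for line shapes (the `hough_line_mask` construction is a simplification for illustration, not the paper's optical design):

```python
import numpy as np

N = 64  # a 64 x 64 microlens grid: 4096 inputs and 4096 outputs

def hough_line_mask(n):
    """Build a hypothetical (n*n, n*n) binary weight matrix whose output
    cells accumulate votes for lines parameterized by (angle, offset)."""
    ys, xs = np.mgrid[0:n, 0:n]
    thetas = np.linspace(0.0, np.pi, n, endpoint=False)
    W = np.zeros((n * n, n * n), dtype=np.float32)   # 16M weights for n = 64
    for ti, theta in enumerate(thetas):
        # Signed distance of every pixel from the line family at this angle.
        rho = xs * np.cos(theta) + ys * np.sin(theta)
        # Quantize the distance into n offset bins.
        bins = np.clip(((rho / (n * np.sqrt(2)) + 0.5) * n).astype(int), 0, n - 1)
        for oi in range(n):
            W[ti * n + oi] = (bins == oi).ravel()
    return W

W = hough_line_mask(N)
image = np.zeros((N, N), dtype=np.float32)
image[32, :] = 1.0                          # a horizontal line at y = 32
accumulator = W @ image.ravel()             # all 16M weights applied "at once"
angle_bin, offset_bin = np.unravel_index(accumulator.argmax(), (N, N))
print(angle_bin, offset_bin)                # peak identifies the line's parameters
```

The single matrix-vector product is the operation the optical processor evaluates in one pass; computed digitally it costs roughly 16M multiply-accumulate operations per frame.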

7 citations

Proceedings ArticleDOI
S. Wahl, Zhe Wang, C. Qiu, Marek Wroblewski, L. Rockstroh, Sven Simon
01 Dec 2010
TL;DR: This paper proposes a relaxation to the context update of JPEG-LS by delaying the update procedure, in order to achieve a guaranteed degree of parallelism with a negligible effect on the compression ratio.
Abstract: Many state-of-the-art lossless image compression standards feature adaptive error modelling. This, however, leads to data-dependency loops in the compression scheme, so that a parallel compression of neighboring pixels is not possible. In this paper, we propose a relaxation to the context update of JPEG-LS by delaying the update procedure, in order to achieve a guaranteed degree of parallelism with a negligible effect on the compression ratio. The lossless mode of JPEG-LS, including the run mode, is considered. A deskewing scheme is provided that generates a bit-stream preserving the order needed for the decoder to mimic the prediction in a consistent way. The system is memory-efficient in the sense that no additional memory for the large context set is needed.
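The trade-off can be illustrated with a toy adaptive predictor: if the context statistics are updated only with pixels at least d positions behind the current one, then within any block of d consecutive pixels no pixel's statistics depend on another pixel of the same block, so the block can be encoded concurrently. The sketch below is a drastic simplification (real JPEG-LS uses gradient-quantized contexts, Golomb coding, and a run mode); the function name and bias model are illustrative only:

```python
import numpy as np

def delayed_context_errors(row, d):
    """Toy 1-D adaptive predictor with its context update delayed by d pixels.

    The bias statistics used at pixel i include only pixels up to i-d-1,
    so within any block of d consecutive pixels no pixel's statistics
    depend on another pixel of the same block: the block can be encoded
    in parallel (in lossless mode the neighbor values are the original
    pixels, so they are available to every encoder thread).
    """
    bias_sum, bias_cnt = 0, 1            # adaptive bias estimate (sum/count)
    errors = np.empty_like(row)
    for i in range(len(row)):
        pred = row[i - 1] if i > 0 else 0        # simple left-neighbor predictor
        errors[i] = row[i] - (pred + bias_sum // bias_cnt)
        if i >= d:                               # delayed update: fold in pixel i-d
            past_pred = row[i - d - 1] if i - d > 0 else 0
            bias_sum += int(row[i - d]) - int(past_pred)
            bias_cnt += 1
    return errors

row = np.array([10, 12, 13, 13, 40, 41, 43, 43], dtype=np.int64)
print(delayed_context_errors(row, d=4))  # any 4 consecutive pixels are independent
```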

7 citations

Journal ArticleDOI
TL;DR: This work proposes a new approach to analyzing degree of parallelism for concurrent workflow processes with shared resources and demonstrates the application and evaluates the effectiveness in a real-world business scenario.
Abstract: Degree of parallelism is an important factor in workflow process management, because it is useful for accurately estimating server costs and scheduling servers in workflow processes. However, existing methods for computing degree of parallelism neglect activities with uncertain execution time. In addition, these methods are limited in dealing with situations where activities in multiple concurrent workflow processes use shared resources. To address these limitations, we propose a new approach to analyzing degree of parallelism for concurrent workflow processes with shared resources. Unlike existing methods, our approach can compute degree of parallelism for multiple concurrent workflow processes whose activities have uncertain execution time and shared resources. The expected degree of parallelism is useful for estimating the server costs of the workflow processes, and the maximum degree of parallelism can guide managers in allocating servers or virtual machines based on business requirements. We demonstrate the application of the approach and evaluate its effectiveness in a real-world business scenario.
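For intuition, both quantities can be estimated by Monte Carlo simulation under one plausible reading (a simplified setup, not the paper's method; it ignores shared-resource constraints and process structure): sample each activity's uncertain duration, sweep the resulting time intervals, and record how many activities overlap.

```python
import random

def degree_of_parallelism(activities, trials=10_000):
    """Estimate expected and maximum degree of parallelism by sampling.

    activities: list of (start_time, sample_duration) pairs, where
    sample_duration is a callable modeling uncertain execution time.
    """
    exp_sum, overall_max = 0.0, 0
    for _ in range(trials):
        intervals = [(s, s + sample()) for s, sample in activities]
        # Sweep event points: +1 at each start, -1 at each end
        # (ties sort ends first, so back-to-back activities don't overlap).
        events = [(s, 1) for s, e in intervals] + [(e, -1) for s, e in intervals]
        running, peak = 0, 0
        for _, delta in sorted(events):
            running += delta
            peak = max(peak, running)
        exp_sum += peak
        overall_max = max(overall_max, peak)
    return exp_sum / trials, overall_max

acts = [(0.0, lambda: random.uniform(2, 5)),   # three concurrent activities
        (1.0, lambda: random.uniform(1, 4)),   # with uncertain durations
        (2.5, lambda: random.uniform(1, 3))]
expected, maximum = degree_of_parallelism(acts)
print(f"expected peak parallelism ~ {expected:.2f}, maximum = {maximum}")
```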

7 citations

Journal ArticleDOI
TL;DR: In this paper, a parallel speedup model that accounts for variations in the average data-access delay is proposed to describe the limiting effect of the memory wall on parallel speedups in homogeneous shared-memory architectures.
Abstract: After Amdahl’s trailblazing work, many other authors proposed analytical speedup models, but none considered the limiting effect of the memory wall. These models exploited aspects such as problem-size variation, memory size, communication overhead, and synchronization overhead, but data-access delays were assumed to be constant. Nevertheless, such delays can vary, for example, according to the number of cores used and the ratio between processor and memory frequencies. Given the large number of configurations of operating frequency and core count that current architectures offer, speedup models that describe such variations across configurations are quite desirable for off-line or on-line scheduling decisions. This work proposes a new parallel speedup model that accounts for variations in the average data-access delay to describe the limiting effect of the memory wall on parallel speedups in homogeneous shared-memory architectures. Analytical results indicate that the proposed model captures the desired behavior, and experimental hardware results validate it. Additionally, we show that when accounting for parameters that reflect the intrinsic characteristics of applications, such as the degree of parallelism and susceptibility to the memory wall, our proposal has significant advantages over machine-learning-based modeling. Moreover, our experiments show that conventional machine-learning modeling, besides being a black box, needs about one order of magnitude more measurements to reach the same level of accuracy achieved by the proposed model.
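To make the idea concrete, here is a hypothetical Amdahl-style variant (not the paper's actual model) in which the average data-access delay grows linearly with the number of cores contending for memory: S(p) = 1 / (f + (1 - f)/p + δp), with serial fraction f and per-core memory penalty δ.

```python
def speedup(p, serial_frac=0.05, mem_penalty=0.002):
    """Hypothetical Amdahl-style speedup with a memory-wall term.

    Runtime on p cores (in units of the single-core compute time) is
    modeled as  serial_frac + (1 - serial_frac)/p + mem_penalty * p,
    where the last term stands in for an average data-access delay that
    grows as more cores contend for shared memory bandwidth.
    """
    return 1.0 / (serial_frac + (1.0 - serial_frac) / p + mem_penalty * p)

for p in (1, 2, 4, 8, 16, 32, 64):
    print(f"{p:3d} cores -> speedup {speedup(p):6.2f}")
# Unlike plain Amdahl's law, this curve peaks and then declines once the
# memory term dominates, at p = sqrt((1 - serial_frac)/mem_penalty) ≈ 22.
```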

7 citations

Journal ArticleDOI
TL;DR: This work designs a system to process multiple frames of a pixel in parallel, which enables better utilization of GPU memory and also makes it possible to design an efficient out‐of‐core algorithm required in rendering real‐world animations.
Abstract: We present an efficient and scalable system that enables programmable motion effects on GPUs. Our system is based on the framework proposed by Schmid et al. [SSBG10], which extends the concept of a surface shader to that of a programmable motion effect. While capable of expressing a variety of motion depiction styles, the execution of motion effect programs requires global knowledge about all portions of an object's surface that pass in front of a pixel during an arbitrarily long period of time, resulting in extremely high memory usage and significantly restricting the degree of parallelism of typical GPU rendering algorithms, which parallelize computations over the pixels in each frame of an animation. To address this problem, we design our system to process multiple frames of a pixel in parallel. This new parallelization approach enables better utilization of GPU memory and also makes it possible to design an efficient out-of-core algorithm required for rendering real-world animations. We also develop an analytical visibility algorithm to resolve depth conflicts among objects, reducing the required temporal resampling rate and further exposing parallelism. Experiments show that we are able to handle very large scenes and improve runtime performance by up to an order of magnitude. © 2012 Wiley Periodicals, Inc.
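The key restructuring, iterating over the frames of one pixel rather than over the pixels of one frame, amounts to a loop-nest reordering. The toy sketch below is hypothetical (the real system is a GPU renderer; `motion_effect` merely stands in for an effect program that needs a pixel's multi-frame surface history) and shows why per-pixel parallelism keeps only one pixel's history resident at a time:

```python
from concurrent.futures import ThreadPoolExecutor

FRAMES, PIXELS = 8, 4

def motion_effect(frame, pixel_history):
    """Stand-in for a programmable motion effect: it needs the surface
    samples covering one pixel over all frames up to `frame`."""
    return sum(pixel_history[:frame + 1]) / (frame + 1)  # e.g. a motion-blur average

# history[p][f]: surface sample covering pixel p at frame f (toy data)
history = [[float(p + f) for f in range(FRAMES)] for p in range(PIXELS)]

# Typical GPU rendering parallelizes over pixels within one frame, which
# forces every pixel's long surface history to be resident at once. The
# paper instead parallelizes over the frames of a single pixel, so only
# that pixel's history must be in memory at a time.
with ThreadPoolExecutor() as pool:
    for p in range(PIXELS):
        pixel_history = history[p]             # the only data this pass needs
        results = list(pool.map(lambda f: motion_effect(f, pixel_history),
                                range(FRAMES)))
        print(f"pixel {p}: {results}")
```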

6 citations


Network Information
Related Topics (5)

Topic                    Papers    Citations    Relatedness
Server                   79.5K     1.4M         85%
Scheduling (computing)   78.6K     1.3M         83%
Network packet           159.7K    2.2M         80%
Web service              57.6K     989K         80%
Quality of service       77.1K     996.6K       79%
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2022    1
2021    47
2020    48
2019    52
2018    70
2017    75