Topic

Degree of parallelism

About: Degree of parallelism is a research topic. Over its lifetime, 1,515 publications have appeared within this topic, receiving 25,546 citations.


Papers
Journal Article
TL;DR: Algorithms for recursive implementation of the eigendecomposition (ED) of the autocorrelation matrix and the SVD of the data matrix, and the ED/SVD trade-off, are discussed.

12 citations

Proceedings Article
28 Jan 2014
TL;DR: This paper presents a high-quality H.265/HEVC motion estimation implementation in which the CPU and GPU cooperate: to overcome the constraint imposed by the MVP, the GPU searches with an estimated MVP and the CPU refines the motion vector with the accurate MVP.
Abstract: This paper presents a high-quality H.265/HEVC motion estimation implementation based on cooperation between the CPU and the GPU. The data dependency on the MVP (Motion Vector Predictor) restricts the degree of parallelism on the GPU. To overcome this constraint, we propose to use an estimated MVP on the GPU and the accurate MVP to refine the motion vector on the CPU. The GPU fully utilizes its parallel computing ability without the restriction from the MVP. The CPU makes up for the deviation from the GPU with a small-range refinement. Encoding speed benefits from the high degree of parallelism, and compression performance is maintained by the CPU refinement. Experimental results show speedups of 2.39 times and 32.77 times for the whole x265 encoder with CPU SIMD (Single Instruction Multiple Data) on and off, respectively. The quality degradation is negligible, with only a 0.05% increase in BD-rate.

12 citations
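
The two-stage idea in the abstract above (a wide search against an estimated predictor, followed by a small-range refinement against the accurate one) can be sketched in a few lines. This is only an illustration under stated assumptions, not the paper's implementation: the function names, the SAD-plus-lambda cost, and the window sizes are all hypothetical, and both stages run on the CPU here, whereas the paper runs the first on the GPU.

```python
def sad(cur, ref, bx, by, mv, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by mv = (mvx, mvy)."""
    mvx, mvy = mv
    return sum(
        abs(cur[by + y][bx + x] - ref[by + mvy + y][bx + mvx + x])
        for y in range(n)
        for x in range(n)
    )


def cost(cur, ref, bx, by, mv, mvp, n, lam):
    """Distortion plus lam times a crude proxy for the bits spent on mv - mvp."""
    rate = abs(mv[0] - mvp[0]) + abs(mv[1] - mvp[1])
    return sad(cur, ref, bx, by, mv, n) + lam * rate


def search(cur, ref, bx, by, center, radius, mvp, n, lam):
    """Exhaustive search in a square window around `center`; returns the best mv."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = center, None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            mv = (center[0] + dx, center[1] + dy)
            x0, y0 = bx + mv[0], by + mv[1]
            if x0 < 0 or y0 < 0 or x0 + n > width or y0 + n > height:
                continue  # candidate block falls outside the reference frame
            c = cost(cur, ref, bx, by, mv, mvp, n, lam)
            if best_cost is None or c < best_cost:
                best_mv, best_cost = mv, c
    return best_mv


def two_stage_motion_search(cur, ref, bx, by, n, est_mvp, true_mvp,
                            lam=1, wide=3, narrow=1):
    # Stage 1 (the GPU's role in the paper): every block can be searched
    # independently because only an *estimated* predictor is needed.
    coarse = search(cur, ref, bx, by, est_mvp, wide, est_mvp, n, lam)
    # Stage 2 (the CPU's role): small-range refinement with the accurate predictor.
    refined = search(cur, ref, bx, by, coarse, narrow, true_mvp, n, lam)
    return coarse, refined
```

Because the refinement window always contains the coarse vector, the refined vector can never cost more than the coarse one under the accurate predictor.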

Journal Article
01 Mar 1990
TL;DR: A programmable systolic device is designed to cater for all tasks of image processing based on mathematical morphology; the design achieves extremely short clock cycles and a high degree of parallelism.
Abstract: Systolic array architectures are favourable for special-purpose systems as they are simple and offer a high degree of concurrency. A programmable systolic device is designed to cater for all tasks of image processing based on mathematical morphology. The design consists of a systolic memory matrix accessible via a rotation operation by a linear systolic array of simple processing elements. The instruction set consists of 1-bit assignments, logical AND and OR, and shift operations on the memory. Thus extremely short clock cycles and a high degree of parallelism can be achieved.

12 citations
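
To illustrate why shifts and logical AND/OR suffice for binary morphology, as the abstract above implies, here is a minimal sketch in which each image row is packed into a Python integer so that one bitwise operation acts on all pixels of the row at once. The function names, the 3x3 square structuring element, and the zero border are assumptions made for the example; the paper's actual device and instruction encoding are not reproduced.

```python
def dilate_row(row, width):
    """Horizontal dilation: a pixel is set if it or either neighbour is set."""
    mask = (1 << width) - 1  # keep only the `width` valid bits
    return (row | (row << 1) | (row >> 1)) & mask


def erode_row(row, width):
    """Horizontal erosion: a pixel survives only if both neighbours are set too.
    Pixels outside the image count as 0, so border pixels are eroded away."""
    mask = (1 << width) - 1
    return row & (row << 1) & (row >> 1) & mask


def dilate(image, width):
    """3x3 dilation of a binary image given as a list of bit-packed rows."""
    h = [dilate_row(r, width) for r in image]
    padded = [0] + h + [0]  # zero rows above and below the image
    return [padded[i] | padded[i + 1] | padded[i + 2] for i in range(len(h))]


def erode(image, width):
    """3x3 erosion of a binary image given as a list of bit-packed rows."""
    h = [erode_row(r, width) for r in image]
    padded = [0] + h + [0]
    return [padded[i] & padded[i + 1] & padded[i + 2] for i in range(len(h))]
```

For instance, dilating a 5x5 image that contains a single centre pixel yields a 3x3 square, and eroding that square recovers the single pixel.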

Proceedings Article
04 Jun 2005
TL;DR: This paper describes the principles and features of SMI++ as well as its integration with an industrial SCADA tool for use by the LHC experiments, and shows that such tools can provide a very convenient mechanism for the automation of large-scale, high-complexity applications.
Abstract: The new LHC experiments at CERN have very large numbers of channels to operate. In order to be able to configure and monitor such large systems, a high degree of parallelism is necessary. The control system is built as a hierarchy of sub-systems distributed over several computers. A toolkit, SMI++, combining two approaches (finite state machines and rule-based programming), allows for the description of the various sub-systems as decentralized deciding entities, reacting in real time to changes in the system, thus providing for the automation of standard procedures and for the automatic recovery from error conditions in a hierarchical fashion. In this paper we describe the principles and features of SMI++ as well as its integration with an industrial SCADA tool for use by the LHC experiments, and we try to show that such tools can provide a very convenient mechanism for the automation of large-scale, high-complexity applications.

12 citations
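
The combination of finite state machines and rules in a control hierarchy, as described in the abstract above, might look roughly like the following sketch. It is not SMI++ code and does not use the SMI++ API or its description language: the class names, state names, commands, and the recovery rule are all hypothetical and chosen only to show the pattern of states summarized upward and commands propagated downward.

```python
class Device:
    """Leaf node: a small finite state machine driven by commands."""

    TRANSITIONS = {
        ("OFF", "SWITCH_ON"): "ON",
        ("ON", "SWITCH_OFF"): "OFF",
        ("ERROR", "RESET"): "OFF",
    }

    def __init__(self, name):
        self.name = name
        self.state = "OFF"

    def command(self, cmd):
        # Commands that are not valid in the current state are ignored.
        self.state = self.TRANSITIONS.get((self.state, cmd), self.state)


class SubSystem:
    """Inner node: its state is derived from its children by simple rules."""

    def __init__(self, name, children):
        self.name = name
        self.children = children

    @property
    def state(self):
        states = [child.state for child in self.children]
        if any(s == "ERROR" for s in states):
            return "ERROR"
        if all(s == "ON" for s in states):
            return "ON"
        if all(s == "OFF" for s in states):
            return "OFF"
        return "MIXED"

    def command(self, cmd):
        # Commands propagate down the hierarchy.
        for child in self.children:
            child.command(cmd)

    def react(self):
        # Hypothetical recovery rule: reset any device in ERROR and switch it
        # back on, handling each fault at the lowest level that sees it.
        for child in self.children:
            if isinstance(child, SubSystem):
                child.react()
            elif child.state == "ERROR":
                child.command("RESET")
                child.command("SWITCH_ON")
```

Each sub-system decides locally, so independent branches of the tree could react in parallel; this sketch walks them sequentially for simplicity.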

Patent
Michael Keith
11 Dec 1992
TL;DR: In this paper, a single-instruction, multiple-data (SIMD) architecture is adopted to exploit the high degree of parallelism inherent in many video signal processing algorithms.
Abstract: Single-instruction multiple-data is a new class of integrated video signal processors especially suited for real-time processing of two-dimensional images. The single-instruction, multiple-data architecture is adopted to exploit the high degree of parallelism inherent in many video signal processing algorithms. Features have been added to the architecture that support conditional execution and sequencing, an inherent limitation of traditional single-instruction multiple-data machines. A separate transfer engine offloads transaction processing from the execution core, allowing balancing of input/output and compute resources, a critical factor in optimizing performance for video processing. These features, coupled with a scalable architecture, allow a united programming model and application-driven performance.

12 citations


Network Information
Related Topics (5)
Server: 79.5K papers, 1.4M citations, 85% related
Scheduling (computing): 78.6K papers, 1.3M citations, 83% related
Network packet: 159.7K papers, 2.2M citations, 80% related
Web service: 57.6K papers, 989K citations, 80% related
Quality of service: 77.1K papers, 996.6K citations, 79% related
Performance Metrics
No. of papers in the topic in previous years
Year  Papers
2022  1
2021  47
2020  48
2019  52
2018  70
2017  75