Topic

Pipeline (computing)

About: Pipeline (computing) is a research topic. Over its lifetime, 26,760 publications have been published within this topic, receiving 204,305 citations. The topic is also known as: data pipeline & computational pipeline.


Papers
Journal Article
TL;DR: This work presents a novel streaming CT framework that conceptualizes the reconstruction process as a steady flow of data across a computing pipeline, updating the reconstruction result immediately after the projections have been acquired.
Abstract: The recent emergence of various types of flat-panel x-ray detectors and C-arm gantries now enables the construction of novel imaging platforms for a wide variety of clinical applications. Many of these applications require interactive 3D image generation, which cannot be satisfied with inexpensive PC-based solutions using the CPU. We present a solution based on commodity graphics hardware (GPUs) to provide these capabilities. While GPUs have been employed for CT reconstruction before, our approach provides significant speedups by exploiting the various built-in hardwired graphics pipeline components for the most expensive CT reconstruction task, backprojection. We show that the timings so achieved are superior to those obtained when using the GPU merely as a multi-processor, without a drop in reconstruction quality. In addition, we also show how the data flow across the graphics pipeline can be optimized, by balancing the load among the pipeline components. The result is a novel streaming CT framework that conceptualizes the reconstruction process as a steady flow of data across a computing pipeline, updating the reconstruction result immediately after the projections have been acquired. Using a single PC equipped with a single high-end commodity graphics board (the Nvidia 8800 GTX), our system is able to process clinically-sized projection data at speeds meeting and exceeding the typical flat-panel detector data production rates, enabling throughput rates of 40-50 projections s⁻¹ for the reconstruction of 512³ volumes.

250 citations
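
The streaming idea in the abstract above, where the result is updated as soon as each projection arrives, can be illustrated with a minimal sketch. This is not the authors' GPU implementation. The stage names (`acquire`, `filter_projection`, `backproject`), the toy one-dimensional "volume", and the placeholder filter are all assumptions, chosen only to show data flowing through a chain of pipeline stages.

```python
from typing import Iterable, Iterator, List

Projection = List[float]


def acquire(projections: Iterable[Projection]) -> Iterator[Projection]:
    """Stage 1 (hypothetical): hand over projections one at a time."""
    for projection in projections:
        yield projection


def filter_projection(stream: Iterator[Projection]) -> Iterator[Projection]:
    """Stage 2 (hypothetical): placeholder filter that halves every sample."""
    for projection in stream:
        yield [0.5 * value for value in projection]


def backproject(stream: Iterator[Projection], size: int) -> Iterator[List[float]]:
    """Stage 3: add each projection to the running result.

    A snapshot is yielded after every update, so a consumer sees the
    result improve while later projections are still being acquired.
    Real backprojection smears values along rays; this toy version
    simply accumulates sample by sample.
    """
    volume = [0.0] * size
    for projection in stream:
        for index, value in enumerate(projection):
            volume[index] += value
        yield list(volume)


def reconstruct_streaming(projections: Iterable[Projection], size: int) -> Iterator[List[float]]:
    """Chain the three stages into one lazy pipeline."""
    return backproject(filter_projection(acquire(projections)), size)


# Hypothetical input: two projections of three samples each.
projections = [[2.0, 4.0, 6.0], [2.0, 0.0, 2.0]]
snapshots = list(reconstruct_streaming(projections, 3))
```

Because every stage is a generator, no stage waits for the full data set: each projection passes through all stages before the next one is requested, which is the property the abstract describes as a steady flow of data.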

Journal Article
01 May 2002
TL;DR: This study indicates that further pipelining can at best improve performance of integer programs by a factor of 2 over current designs, and proposes and evaluates a high-frequency design called a segmented instruction window.
Abstract: Microprocessor clock frequency has improved by nearly 40% annually over the past decade. This improvement has been provided, in equal measure, by smaller technologies and deeper pipelines. From our study of the SPEC 2000 benchmarks, we find that for a high-performance architecture implemented in 100 nm technology, the optimal clock period is approximately 8 fan-out-of-four (FO4) inverter delays for integer benchmarks, comprising 6 FO4 of useful work and an overhead of about 2 FO4. The optimal clock period for floating-point benchmarks is 6 FO4. We find these optimal points to be insensitive to latch and clock skew overheads. Our study indicates that further pipelining can at best improve performance of integer programs by a factor of 2 over current designs. At these high clock frequencies it will be difficult to design the instruction issue window to operate in a single cycle. Consequently, we propose and evaluate a high-frequency design called a segmented instruction window.

249 citations
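
The clock-period arithmetic in the abstract above (6 FO4 of useful work plus about 2 FO4 of overhead gives roughly 8 FO4 for integer benchmarks) can be written out as a small worked example. The function names and the total logic depth of 120 FO4 are hypothetical, introduced only for illustration; they do not come from the paper.

```python
import math


def clock_period_fo4(useful_fo4: float, overhead_fo4: float) -> float:
    """Clock period per stage: useful logic plus latch/skew overhead."""
    return useful_fo4 + overhead_fo4


def useful_fraction(useful_fo4: float, overhead_fo4: float) -> float:
    """Share of each cycle spent on useful work rather than overhead."""
    return useful_fo4 / clock_period_fo4(useful_fo4, overhead_fo4)


def stages_needed(total_logic_fo4: float, useful_fo4: float) -> int:
    """Stages required to split a fixed amount of logic into equal slices."""
    return math.ceil(total_logic_fo4 / useful_fo4)


# Figures quoted in the abstract for integer benchmarks.
integer_period = clock_period_fo4(6, 2)
fraction = useful_fraction(6, 2)

# Hypothetical total logic depth, an assumption used only to show the trade-off.
stages = stages_needed(120, 6)
latency_fo4 = stages * integer_period

print(integer_period, fraction, stages, latency_fo4)
```

In this simple model, cutting the useful work per stage shortens the clock period but raises the stage count, so overhead takes a growing share of every cycle. That is one way to read the abstract's claim that deeper pipelining has a limited remaining payoff.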

Journal Article
TL;DR: This paper examines common implementations of linear algebra algorithms, such as matrix-vector multiplication, matrix-matrix multiplication and the solution of linear equations for efficiency on a computer architecture which uses vector processing and has pipelined instruction execution.
Abstract: This paper examines common implementations of linear algebra algorithms, such as matrix-vector multiplication, matrix-matrix multiplication and the solution of linear equations. The different versions are examined for efficiency on a computer architecture which uses vector processing and has pipelined instruction execution. By using the advanced architectural features of such machines, one can usually achieve maximum performance, and tremendous improvements in terms of execution speed can be seen over conventional computers.

249 citations
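
The abstract above compares different versions of the same linear algebra algorithm. As a minimal sketch, two loop orderings for matrix-vector multiplication are shown below. That the versions differ in loop order is an assumption here, since the abstract does not list them, and the function names are hypothetical.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def matvec_row_oriented(a: Matrix, x: Vector) -> Vector:
    """Inner-product form: each y[i] is the dot product of row i with x."""
    return [sum(a[i][j] * x[j] for j in range(len(x))) for i in range(len(a))]


def matvec_column_oriented(a: Matrix, x: Vector) -> Vector:
    """Column form: y accumulates x[j] times column j, one column at a time.

    The inner loop is a vector update (y = y + scalar * column), an
    operation that maps naturally onto vector hardware.
    """
    y = [0.0] * len(a)
    for j in range(len(x)):
        xj = x[j]
        for i in range(len(a)):
            y[i] += a[i][j] * xj
    return y


a = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
x = [1.0, 2.0]
y_row = matvec_row_oriented(a, x)
y_col = matvec_column_oriented(a, x)
```

Both functions compute the same product; they differ only in the order in which memory is traversed and in the kind of inner-loop operation, which is the sort of difference that affects efficiency on a vector machine.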

Journal Article
TL;DR: A pipeline structure of a transform decoder similar to a systolic array is developed to decode Reed-Solomon (RS) codes, using a modified Euclidean algorithm for computing the error-locator polynomial.
Abstract: A pipeline structure of a transform decoder similar to a systolic array is developed to decode Reed-Solomon (RS) codes. An important ingredient of this design is a modified Euclidean algorithm for computing the error-locator polynomial. The computation of inverse field elements is completely avoided in this modification of Euclid's algorithm. The new decoder is regular and simple, and naturally suitable for VLSI implementation. An example illustrating both the pipeline and systolic array aspects of this decoder structure is given for a (15,9) RS code.

247 citations


Network Information
Related Topics (5)
Cache: 59.1K papers, 976.6K citations, 86% related
Scalability: 50.9K papers, 931.6K citations, 85% related
Server: 79.5K papers, 1.4M citations, 82% related
Electronic circuit: 114.2K papers, 971.5K citations, 82% related
CMOS: 81.3K papers, 1.1M citations, 81% related
Performance Metrics
No. of papers in the topic in previous years
Year    Papers
2022    18
2021    1,066
2020    1,556
2019    1,793
2018    1,754
2017    1,548