Topic

Pipeline (computing)

About: Pipeline (computing) is a research topic. Over its lifetime, 26,760 publications have been published within this topic, receiving 204,305 citations. The topic is also known as: data pipeline & computational pipeline.


Papers
Journal ArticleDOI
TL;DR: The architecture and performance of the GRAPE-4 system, a massively parallel special-purpose computer for N-body simulation of gravitational collisional systems, are described.
Abstract: In this paper, we describe the architecture and performance of the GRAPE-4 system, a massively parallel special-purpose computer for N-body simulation of gravitational collisional systems. The calculation cost of an N-body simulation of a collisional self-gravitating system is O(N³). Thus, even with present-day supercomputers, the number of particles one can handle is still around 10,000. In N-body simulations, almost all computing time is spent calculating the forces between particles, since the number of interactions is proportional to the square of the number of particles. The computational cost of the rest of the simulation, such as the time integration and the reduction of the results, is generally proportional to the number of particles. The calculation of the forces between particles can be greatly accelerated by means of dedicated special-purpose hardware. We have developed a series of hardware systems, the GRAPE (GRAvity PipE) systems, which perform the force calculation. They are used with a general-purpose host computer which performs the rest of the calculation. The GRAPE-4 system is our newest hardware, completed in the summer of 1995. Its peak speed is 1.08 TFLOPS, achieved by running 1692 pipeline large-scale integrated circuits (LSIs), each providing 640 MFLOPS, in parallel.
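The O(N²)-per-step pairwise force loop that dominates such simulations, and that GRAPE moves into hardware, can be sketched in software as follows (a minimal Python illustration with Plummer softening, in G = 1 units; this is an assumption-laden sketch, not the GRAPE implementation):

```python
import numpy as np

def pairwise_accelerations(pos, mass, eps=1e-3):
    """Direct-summation gravitational accelerations, O(N^2) in N bodies.

    pos:  (N, 3) array of positions
    mass: (N,) array of masses
    eps:  softening length to avoid singular forces at zero separation
    """
    n = len(mass)
    acc = np.zeros_like(pos)
    for i in range(n):
        d = pos - pos[i]                      # vectors from body i to all bodies
        r2 = (d * d).sum(axis=1) + eps * eps  # softened squared distances
        r2[i] = 1.0                           # dummy value; self-term masked below
        inv_r3 = r2 ** -1.5
        inv_r3[i] = 0.0                       # no self-interaction
        acc[i] = (mass[:, None] * d * inv_r3[:, None]).sum(axis=0)
    return acc
```

Each of the N outer iterations touches all N bodies, which is exactly the part GRAPE evaluates in parallel pipelines while the host handles time integration.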

193 citations

Journal ArticleDOI
TL;DR: In this paper, a probabilistic approach is adopted for assessing the remaining life of a pressurised pipeline containing active corrosion defects. The associated variables are represented by normal or non-normal probability distributions. The relative contribution of the random variables, and the sensitivity of the reliability index to changes in their variances, are also investigated.
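A probabilistic remaining-life assessment of this kind can be sketched with a Monte Carlo estimate of the failure probability. Every variable name, distribution, and the limit-state form below are assumptions chosen for illustration, not values taken from the paper:

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(42)
n = 100_000

# Illustrative random variables (all assumptions for this sketch):
t  = rng.normal(10.0, 0.3, n)            # wall thickness, mm
d0 = rng.normal(2.0, 0.5, n)             # current defect depth, mm
r  = rng.lognormal(np.log(0.1), 0.3, n)  # corrosion growth rate, mm/year
years = 30.0

d = d0 + r * years                       # projected defect depth at `years`
# Limit state: failure when the remaining ligament drops below 20% of wall
g = (t - d) - 0.2 * t

pf = float(np.mean(g < 0))               # estimated probability of failure
beta = -NormalDist().inv_cdf(pf) if 0.0 < pf < 1.0 else float("inf")
print(f"P(failure at {years:.0f} y) ~ {pf:.4f}, reliability index ~ {beta:.2f}")
```

Sensitivity studies like the paper's can then be run by perturbing each variable's variance and observing the change in the reliability index beta.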

192 citations

Patent
31 Jan 1990
TL;DR: When the microcode execution unit determines that a new operation is required, an entry is inserted into the result queue; the entry includes all the information needed by the retire unit to retire the result once it is available from the respective functional unit.
Abstract: To increase the performance of a pipelined processor executing various classes of instructions, the classes of instructions are executed by respective functional units (164-167) which are independently controlled and operated in parallel. The classes of instructions include integer instructions (164), floating point instructions (165), multiply instructions (166), and divide instructions (167). The integer unit, which also performs shift operations, is controlled by the microcode execution unit (26) to handle the wide variety of integer and shift operations included in a complex, variable-length instruction set. The other functional units need only accept a control command to initiate the operation to be performed. The retiring of the results of the instructions need not be controlled by the microcode execution unit, but instead is delegated to a separate retire unit (173) that services a result queue (172). When the microcode execution unit determines that a new operation is required, an entry is inserted into the result queue. The entry includes all the information needed by the retire unit to retire the result once the result is available from the respective functional unit. The retire unit services the result queue by reading a tag in the entry at the head of the queue to determine the functional unit that is to provide the result. Once the result is available and the destination specified by the entry is also available, the result is retired in accordance with the entry, and the entry is removed from the queue.
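The result-queue scheme can be sketched in software as follows (hypothetical class and method names; the patent describes hardware, so this is only a behavioural model of issue, out-of-order completion, and in-order retirement):

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class ResultEntry:
    tag: str           # which functional unit will produce the result
    dest: str          # destination register name
    result: object = None
    ready: bool = False

class RetireUnit:
    def __init__(self):
        self.queue = deque()   # one entry per issued operation, program order
        self.registers = {}    # architectural state updated only at retirement

    def issue(self, tag, dest):
        """Insert an entry when the microcode unit dispatches a new operation."""
        entry = ResultEntry(tag, dest)
        self.queue.append(entry)
        return entry

    def complete(self, entry, value):
        """A functional unit delivers its result, possibly out of order."""
        entry.result, entry.ready = value, True

    def retire(self):
        """Retire ready results strictly in queue (program) order."""
        while self.queue and self.queue[0].ready:
            e = self.queue.popleft()
            self.registers[e.dest] = e.result
```

Note that a later operation completing first (say, a fast integer op behind a slow divide) stays queued until the head entry is ready, which is what keeps retirement in program order.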

191 citations

Proceedings ArticleDOI
01 May 2001
TL;DR: The methodology used to validate a microprocessor simulator against a Compaq DS-10L workstation, which contains an Alpha 21264 processor, is described, along with how low-level optimizations reduce average error from 40% to less than 20% on macrobenchmarks drawn from the SPEC2000 suite.
Abstract: We measure the experimental error that arises from the use of non-validated simulators in computer architecture research, with the goal of increasing the rigor of simulation-based studies. We describe the methodology that we used to validate a microprocessor simulator against a Compaq DS-10L workstation, which contains an Alpha 21264 processor. Our evaluation suite consists of a set of 21 microbenchmarks that stress different aspects of the 21264 microarchitecture. Using the microbenchmark suite as the set of workloads, we describe how we reduced our simulator error to an arithmetic mean of 2%, and include details about the specific aspects of the pipeline that required extra care to reduce the error. We show how these low-level optimizations reduce average error from 40% to less than 20% on macrobenchmarks drawn from the SPEC2000 suite. Finally, we examine the degree to which performance optimizations are stable across different simulators, showing that researchers would draw different conclusions, in some cases, if using validated simulators.
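The headline figures suggest per-benchmark relative error averaged arithmetically over the workload suite; a minimal sketch of such a validation metric (the exact formula used in the paper is an assumption here):

```python
def mean_abs_error_pct(sim_cycles, hw_cycles):
    """Arithmetic mean of per-benchmark relative error, in percent.

    sim_cycles, hw_cycles: dicts mapping benchmark name -> measured cycle
    count on the simulator and on the real hardware, respectively.
    """
    errs = [abs(sim_cycles[b] - hw_cycles[b]) / hw_cycles[b] * 100.0
            for b in hw_cycles]
    return sum(errs) / len(errs)
```

A validated simulator in the paper's sense would keep this figure near 2% across the microbenchmark suite.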

191 citations

Journal ArticleDOI
01 May 2000
TL;DR: Results of incorporating instruction cache predictions within pipeline simulation show that timing predictions for set-associative caches remain just as tight as predictions for direct-mapped caches.
Abstract: This paper contributes a comprehensive study of a framework to bound worst-case instruction cache performance for caches with arbitrary levels of associativity. The framework is formally introduced, operationally described and its correctness is shown. Results of incorporating instruction cache predictions within pipeline simulation show that timing predictions for set-associative caches remain just as tight as predictions for direct-mapped caches. The low cache simulation overhead allows interactive use of the analysis tool and scales well with increasing associativity. The approach taken is based on a data-flow specification of the problem and provides another step toward worst-case execution time prediction of contemporary architectures and its use in schedulability analysis for hard real-time systems.
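For contrast with the paper's static data-flow analysis, a tiny LRU set-associative instruction-cache model shows the hit/miss behaviour whose worst case is being bounded (an illustrative simulation only; the paper's framework is analytical, not a simulator):

```python
from collections import OrderedDict

class SetAssociativeICache:
    """Minimal LRU set-associative instruction cache model."""

    def __init__(self, n_sets, assoc, line_bytes=16):
        self.n_sets, self.assoc, self.line = n_sets, assoc, line_bytes
        self.sets = [OrderedDict() for _ in range(n_sets)]

    def access(self, addr):
        """Return True on hit, False on miss; updates LRU state."""
        block = addr // self.line
        s = self.sets[block % self.n_sets]
        if block in s:
            s.move_to_end(block)    # refresh LRU position
            return True
        if len(s) >= self.assoc:
            s.popitem(last=False)   # evict least recently used line
        s[block] = True
        return False
```

Direct-mapped is simply the assoc=1 case, which is one way to see why predictions for set-associative caches need not be looser than for direct-mapped ones.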

191 citations


Network Information
Related Topics (5)
Cache: 59.1K papers, 976.6K citations (86% related)
Scalability: 50.9K papers, 931.6K citations (85% related)
Server: 79.5K papers, 1.4M citations (82% related)
Electronic circuit: 114.2K papers, 971.5K citations (82% related)
CMOS: 81.3K papers, 1.1M citations (81% related)
Performance Metrics
No. of papers in the topic in previous years

Year    Papers
2022    18
2021    1,066
2020    1,556
2019    1,793
2018    1,754
2017    1,548