scispace - formally typeset
Search or ask a question
Topic

Pipeline (computing)

About: Pipeline (computing) is a research topic. Over the lifetime, 26760 publications have been published within this topic receiving 204305 citations. The topic is also known as: data pipeline & computational pipeline.


Papers
More filters
Patent
19 Jan 1982
TL;DR: In this paper, a pipeline of programmable serial neighborhood stages is used to automatically check for adherence to a set of predetermined geometrical constraints or design rules used in the fabrication of electronic components such as integrated circuit devices.
Abstract: A pipeline of programmable serial neighborhood stages is used to automatically check for adherence to a set of predetermined geometrical constraints or design rules used in the fabrication of electronic components such as integrated circuit devices. In a disclosed embodiment, bit-map representations of several IC masks are superimposed in one composite image matrix for processing in the pipeline. Advantages in processing speed, especially for design rule checking requiring data from more than one mask, are obtained since all necessary data is available to the stages in the pipeline.

54 citations

Journal ArticleDOI
TL;DR: The hardware implementation of a simple, fast technique for depth estimation based on phase measurement, which avoids the problem of phase warping and is much less susceptible to camera noise and distortion than standard block-matching stereo systems is presented.
Abstract: We present the hardware implementation of a simple, fast technique for depth estimation based on phase measurement. This technique avoids the problem of phase warping and is much less susceptible to camera noise and distortion than standard block-matching stereo systems. The architecture exploits the parallel computing resources of FPGA devices to achieve a computation speed of 65 megapixels per second. For this purpose, we have designed a fine-grain pipeline structure that can be arranged with a customized frame-grabber module to process 52 frames per second at a resolution of 1280times960 pixels. We have measured the system's degradation due to bit quantization errors and compared its performance with other previous approaches. We have also used different Gabor-scale circuits, which can be selected by the user according to the application addressed and typical image structure in the target scenario

54 citations

Journal ArticleDOI
TL;DR: A novel architecture for implementing fast algorithms on FPGAs, which effectively pipeline the Winograd/FFT processing element (PE) engine and initiate multiple PEs through parallelization, and proposes an analytical model to predict the resource usage and the performance.
Abstract: In recent years, convolutional neural networks (CNNs) have become widely adopted for computer vision tasks. Field-programmable gate arrays (FPGAs) have been adequately explored as a promising hardware accelerator for CNNs due to its high performance, energy efficiency, and reconfigurability. However, prior FPGA solutions based on the conventional convolutional algorithm is often bounded by the computational capability of FPGAs (e.g., the number of DSPs). To address this problem, the feature maps are transformed to a special domain using fast algorithms to reduce the arithmetic complexity. Winograd and fast Fourier transformation (FFT), as fast algorithm representatives, first transform input data and filter to Winograd or frequency domain, then perform element-wise multiplication, and apply inverse transformation to get the final output. In this paper, we propose a novel architecture for implementing fast algorithms on FPGAs. Our design employs line buffer structure to effectively reuse the feature map data among different tiles. We also effectively pipeline the Winograd/FFT processing element (PE) engine and initiate multiple PEs through parallelization. Meanwhile, there exists a complex design space to explore. We propose an analytical model to predict the resource usage and the performance. Then, we use the model to guide a fast design space exploration. Experiments using the state-of-the-art CNNs demonstrate the best performance and energy efficiency on FPGAs. We achieve 854.6 and 2479.6 GOP/s for AlexNet and VGG16 on Xilinx ZCU102 platform using Winograd. We achieve 130.4 GOP/s for Resnet using Winograd and 201.1 GOP/s for YOLO using FFT on Xilinx ZC706 platform.

54 citations

Proceedings ArticleDOI
22 Aug 2007
TL;DR: This paper proposes a simple and effective linear pipeline architecture for trie-based IP lookup that achieves evenly distributed memory while realizing high throughput of one lookup per clock cycle and offers more freedom in mapping trie nodes to pipeline stages by supporting nops.
Abstract: Rapid growth in network link rates poses a strong demand on high speed IP lookup engines. Trie-based architectures are natural candidates for pipelined implementation to provide high throughput. However, simply mapping a trie level onto a pipeline stage results in unbalanced memory distribution over different stages. To address this problem, several novel pipelined architectures have been proposed. But their non-linear pipeline structure results in some new performance issues such as throughput degradation and delay variation. In this paper, we propose a simple and effective linear pipeline architecture for trie-based IP lookup. Our architecture achieves evenly distributed memory while realizing high throughput of one lookup per clock cycle. It offers more freedom in mapping trie nodes to pipeline stages by supporting nops. We implement our design as well as the state-of-the-art solutions on a commodity FPGA and evaluate their performance. Post place and route results show that our design can achieve a throughput of 80 Gbps, up to twice the throughput of reference solutions. It has constant delay, maintains input order, and supports incremental route updates without disrupting the ongoing IP lookup operations.

54 citations

Proceedings ArticleDOI
TL;DR: This paper describes the data reduction pipeline of the GPI science instrument, which reduces an ensemble of highcontrast spectroscopic or polarimetric raw science images and calibration data into a final dataset ready for scientific analysis.
Abstract: The Gemini Planet Imager (GPI) high-contrast adaptive optics system, which is currently under construction for Gemini South, has an IFS as its science instrument. This paper describes the data reduction pipeline of the GPI science instrument. Written in IDL, with a modular architecture, this pipeline reduces an ensemble of highcontrast spectroscopic or polarimetric raw science images and calibration data into a final dataset ready for scientific analysis. It includes speckle suppression techniques such as angular and spectral differential imaging that are necessary to achieve extreme contrast performances for which the instrument is designed. This paper presents also raw GPI IFS simulated data developed to test the pipeline.

54 citations


Network Information
Related Topics (5)
Cache
59.1K papers, 976.6K citations
86% related
Scalability
50.9K papers, 931.6K citations
85% related
Server
79.5K papers, 1.4M citations
82% related
Electronic circuit
114.2K papers, 971.5K citations
82% related
CMOS
81.3K papers, 1.1M citations
81% related
Performance
Metrics
No. of papers in the topic in previous years
YearPapers
202218
20211,066
20201,556
20191,793
20181,754
20171,548