scispace - formally typeset
Search or ask a question
Topic

Pipeline (computing)

About: Pipeline (computing) is a research topic. Over the lifetime, 26760 publications have been published within this topic receiving 204305 citations. The topic is also known as: data pipeline & computational pipeline.


Papers
More filters
Journal ArticleDOI
TL;DR: This paper proposes branch classification, a methodology for building more accurate branch predictors, and an example classification scheme is given and a new hybrid predictor is built based on this scheme which achieves a higher prediction accuracy than any branch predictor previously reported in the literature.
Abstract: There is wide agreement that one of the most significant impediments to the performance of current and future pipelined superscalar processors is the presence of conditional branches in the instruction stream. Speculative execution is one solution to the branch problem, but speculative work is discarded if a branch is mispredicted. For it to be effective, speculative execution requires a very accurate branch predictor; 95% accuracy is not good enough. This paper proposes branch classification, a methodology for building more accurate branch predictors. Branch classification allows an individual branch instruction to be associated with the branch predictor best suited to predict its direction. Using this approach, a hybrid branch predictor can be constructed such that each component branch predictor predicts those branches for which it is best suited. To demonstrate the usefulness of branch classification, an example classification scheme is given and a new hybrid predictor is built based on this scheme which achieves a higher prediction accuracy than any branch predictor previously reported in the literature.

72 citations

Posted Content
Heng Li1
TL;DR: FermiKit as mentioned in this paper is a variant calling pipeline for Illumina data that de novo assembles short reads and then maps the assembly against a reference genome to call SNPs, short insertions/deletions (INDELs) and structural variations (SVs).
Abstract: Summary: FermiKit is a variant calling pipeline for Illumina data. It de novo assembles short reads and then maps the assembly against a reference genome to call SNPs, short insertions/deletions (INDELs) and structural variations (SVs). FermiKit takes about one day to assemble 30-fold human whole-genome data on a modern 16-core server with 85GB RAM at the peak, and calls variants in half an hour to an accuracy comparable to the current practice. FermiKit assembly is a reduced representation of raw data while retaining most of the original information. Availability and implementation: this https URL Contact: hengli@broadinstitute.org

72 citations

Journal ArticleDOI
TL;DR: The gstlal calibration pipeline is also used in high latency to recalibrate the data, which is necessary due mainly to online dropouts in the calibrated data and identified improvements to the calibration models or filters.
Abstract: Advanced LIGO's raw detector output needs to be calibrated to compute dimensionless strain h(t). Calibrated strain data is produced in the time domain using both a low-latency, online procedure and a high-latency, offline procedure. The low-latency h(t) data stream is produced in two stages, the first of which is performed on the same computers that operate the detector's feedback control system. This stage, referred to as the front-end calibration, uses infinite impulse response (IIR) filtering and performs all operations at a 16 384 Hz digital sampling rate. Due to several limitations, this procedure currently introduces certain systematic errors in the calibrated strain data, motivating the second stage of the low-latency procedure, known as the low-latency gstlal calibration pipeline. The gstlal calibration pipeline uses finite impulse response (FIR) filtering to apply corrections to the output of the front-end calibration. It applies time-dependent correction factors to the sensing and actuation components of the calibrated strain to reduce systematic errors. The gstlal calibration pipeline is also used in high latency to recalibrate the data, which is necessary due mainly to online dropouts in the calibrated data and identified improvements to the calibration models or filters.

72 citations

Patent
25 Feb 1991
Abstract: A single chip integrated data link control (IDLC) device provides full duplex data throughput and versatile protocol adaptation between variably configured time channels on a high speed TDM digital link (e.g. T-1 or T-3 line) and a host data processing system. The device can handle multiple channels of voice and varied protocol data traffic, and thereby is suited for use in primary rate ISDN (Integrated Services Digital Network) applications. Synchronous and asynchronous special purpose logic sections in the device respectively interface with the network and a bus extending to external processing systems. Logic in the synchronous section forms plural-stage receive and transmit processing pipelines relative to the network interface. A "resource manager" element (RSM) and time swap (TS) RAM memory operate to dynamically vary states in these pipelines in synchronism with channel time slots at the network interface, whereby each pipeline operates in multitasking mode to perform plural functions relative to each channel during each time slot. The device also includes integrated memory queues in which communication data and channel status information are stacked relative to the device interfaces. Capacities and modes of operation of these queues are selected to minimize effects on chip size, throughput and cost, while supporting operations in the synchronous section pipelines so that critical time dependencies between consecutive pipeline stages, and between the pipelines and external processors, are lessened.

72 citations

Proceedings ArticleDOI
23 Feb 2020
TL;DR: In this paper, a novel accelerator for training GCNs on CPU-FPGA heterogeneous systems, by incorporating multiple algorithm-architecture co-optimizations, is proposed.
Abstract: Graph Convolutional Networks (GCNs) have emerged as the state-of-the-art deep learning model for representation learning on graphs. It is challenging to accelerate training of GCNs, due to (1) substantial and irregular data communication to propagate information within the graph, and (2) intensive computation to propagate information along the neural network layers. To address these challenges, we design a novel accelerator for training GCNs on CPU-FPGA heterogeneous systems, by incorporating multiple algorithm-architecture co-optimizations. We first analyze the computation and communication characteristics of various GCN training algorithms, and select a subgraph-based algorithm that is well suited for hardware execution. To optimize the feature propagation within subgraphs, we propose a light-weight pre-processing step based on a graph theoretic approach. Such pre-processing performed on the CPU significantly reduces the memory access requirements and the computation to be performed on the FPGA. To accelerate the weight update in GCN layers, we propose a systolic array based design for efficient parallelization. We integrate the above optimizations into a complete hardware pipeline, and analyze its load-balance and resource utilization by accurate performance modeling. We evaluate our design on a Xilinx Alveo U200 board hosted by a 40-core Xeon server. On three large graphs, we achieve an order of magnitude training speedup with negligible accuracy loss, compared with state-of-the-art implementation on a multi-core platform.

72 citations


Network Information
Related Topics (5)
Cache
59.1K papers, 976.6K citations
86% related
Scalability
50.9K papers, 931.6K citations
85% related
Server
79.5K papers, 1.4M citations
82% related
Electronic circuit
114.2K papers, 971.5K citations
82% related
CMOS
81.3K papers, 1.1M citations
81% related
Performance
Metrics
No. of papers in the topic in previous years
YearPapers
202218
20211,066
20201,556
20191,793
20181,754
20171,548