Topic

Pipeline (computing)

About: Pipeline (computing) is a research topic. Over its lifetime, 26,760 publications have been published within this topic, receiving 204,305 citations. The topic is also known as data pipeline and computational pipeline.

Papers

Journal Article•DOI•

Branch Classification: A New Mechanism for Improving Branch Predictor Performance

Po-Yung Chang¹, Eric Hao¹, Tse-Yu Yeh², Yale N. Patt¹

University of Michigan¹, Intel²

01 Apr 1996-International Journal of Parallel Programming

TL;DR: This paper proposes branch classification, a methodology for building more accurate branch predictors. It gives an example classification scheme and builds a new hybrid predictor on it that achieves a higher prediction accuracy than any branch predictor previously reported in the literature.

Abstract: There is wide agreement that one of the most significant impediments to the performance of current and future pipelined superscalar processors is the presence of conditional branches in the instruction stream. Speculative execution is one solution to the branch problem, but speculative work is discarded if a branch is mispredicted. For it to be effective, speculative execution requires a very accurate branch predictor; 95% accuracy is not good enough. This paper proposes branch classification, a methodology for building more accurate branch predictors. Branch classification allows an individual branch instruction to be associated with the branch predictor best suited to predict its direction. Using this approach, a hybrid branch predictor can be constructed such that each component branch predictor predicts those branches for which it is best suited. To demonstrate the usefulness of branch classification, an example classification scheme is given and a new hybrid predictor is built based on this scheme which achieves a higher prediction accuracy than any branch predictor previously reported in the literature.

72 citations

Posted Content•

FermiKit: assembly-based variant calling for Illumina resequencing data

Heng Li¹

Broad Institute¹

24 Apr 2015-arXiv: Genomics

TL;DR: FermiKit is a variant calling pipeline for Illumina data that de novo assembles short reads and then maps the assembly against a reference genome to call SNPs, short insertions/deletions (INDELs) and structural variations (SVs).

Abstract: Summary: FermiKit is a variant calling pipeline for Illumina data. It de novo assembles short reads and then maps the assembly against a reference genome to call SNPs, short insertions/deletions (INDELs) and structural variations (SVs). FermiKit takes about one day to assemble 30-fold human whole-genome data on a modern 16-core server with 85GB RAM at the peak, and calls variants in half an hour to an accuracy comparable to the current practice. FermiKit assembly is a reduced representation of raw data while retaining most of the original information. Availability and implementation: this https URL Contact: hengli@broadinstitute.org

72 citations

Journal Article•DOI•

Reconstructing the calibrated strain signal in the Advanced LIGO detectors

Aaron Viets¹,², Madeline Wade³, A. L. Urban⁴, S. Kandhasamy, J. Betzwieser, Duncan A. Brown⁵, Jordi Burguet-Castell, C. Cahillane⁴, Evan Goetz⁶, K. Izumi, S. Karki⁷, J. S. Kissel, G. Mendell, Richard L. Savage, X. Siemens¹, D. Tuyenbayev⁸, Alan J. Weinstein⁴

University of Wisconsin–Milwaukee¹, Concordia University Wisconsin², Kenyon College³, California Institute of Technology⁴, Syracuse University⁵, University of Michigan⁶, University of Oregon⁷, University of Texas at Austin⁸

10 May 2018-Classical and Quantum Gravity

TL;DR: The gstlal calibration pipeline uses FIR filtering to correct the output of the front-end calibration in low latency. It is also used in high latency to recalibrate the data, which is necessary mainly because of online dropouts in the calibrated data and identified improvements to the calibration models or filters.

Abstract: Advanced LIGO's raw detector output needs to be calibrated to compute dimensionless strain h(t). Calibrated strain data is produced in the time domain using both a low-latency, online procedure and a high-latency, offline procedure. The low-latency h(t) data stream is produced in two stages, the first of which is performed on the same computers that operate the detector's feedback control system. This stage, referred to as the front-end calibration, uses infinite impulse response (IIR) filtering and performs all operations at a 16 384 Hz digital sampling rate. Due to several limitations, this procedure currently introduces certain systematic errors in the calibrated strain data, motivating the second stage of the low-latency procedure, known as the low-latency gstlal calibration pipeline. The gstlal calibration pipeline uses finite impulse response (FIR) filtering to apply corrections to the output of the front-end calibration. It applies time-dependent correction factors to the sensing and actuation components of the calibrated strain to reduce systematic errors. The gstlal calibration pipeline is also used in high latency to recalibrate the data, which is necessary due mainly to online dropouts in the calibrated data and identified improvements to the calibration models or filters.

72 citations

Patent•

Integrated data link controller with synchronous link interface and asynchronous host processor interface

Joseph Kevin Farrell¹, Jeffrey Scott Gordon¹, Robert Vincent Jenness¹, Daniel C Kuhl¹, Timothy Vincent Lee¹, Tony Edwin Parker¹

IBM¹

25 Feb 1991

Abstract: A single chip integrated data link control (IDLC) device provides full duplex data throughput and versatile protocol adaptation between variably configured time channels on a high speed TDM digital link (e.g. T-1 or T-3 line) and a host data processing system. The device can handle multiple channels of voice and varied protocol data traffic, and thereby is suited for use in primary rate ISDN (Integrated Services Digital Network) applications. Synchronous and asynchronous special purpose logic sections in the device respectively interface with the network and a bus extending to external processing systems. Logic in the synchronous section forms plural-stage receive and transmit processing pipelines relative to the network interface. A "resource manager" element (RSM) and time swap (TS) RAM memory operate to dynamically vary states in these pipelines in synchronism with channel time slots at the network interface, whereby each pipeline operates in multitasking mode to perform plural functions relative to each channel during each time slot. The device also includes integrated memory queues in which communication data and channel status information are stacked relative to the device interfaces. Capacities and modes of operation of these queues are selected to minimize effects on chip size, throughput and cost, while supporting operations in the synchronous section pipelines so that critical time dependencies between consecutive pipeline stages, and between the pipelines and external processors, are lessened.

72 citations

Proceedings Article•DOI•

GraphACT: Accelerating GCN Training on CPU-FPGA Heterogeneous Platforms

Hanqing Zeng¹, Viktor K. Prasanna¹

University of Southern California¹

23 Feb 2020

TL;DR: This paper proposes a novel accelerator for training GCNs on CPU-FPGA heterogeneous systems that incorporates multiple algorithm-architecture co-optimizations.

Abstract: Graph Convolutional Networks (GCNs) have emerged as the state-of-the-art deep learning model for representation learning on graphs. It is challenging to accelerate training of GCNs, due to (1) substantial and irregular data communication to propagate information within the graph, and (2) intensive computation to propagate information along the neural network layers. To address these challenges, we design a novel accelerator for training GCNs on CPU-FPGA heterogeneous systems, by incorporating multiple algorithm-architecture co-optimizations. We first analyze the computation and communication characteristics of various GCN training algorithms, and select a subgraph-based algorithm that is well suited for hardware execution. To optimize the feature propagation within subgraphs, we propose a light-weight pre-processing step based on a graph theoretic approach. Such pre-processing performed on the CPU significantly reduces the memory access requirements and the computation to be performed on the FPGA. To accelerate the weight update in GCN layers, we propose a systolic array based design for efficient parallelization. We integrate the above optimizations into a complete hardware pipeline, and analyze its load-balance and resource utilization by accurate performance modeling. We evaluate our design on a Xilinx Alveo U200 board hosted by a 40-core Xeon server. On three large graphs, we achieve an order of magnitude training speedup with negligible accuracy loss, compared with state-of-the-art implementation on a multi-core platform.

72 citations

Performance Metrics

Papers: 26,760

Citations: 229,716

No. of papers in the topic in previous years
Year	Papers
2022	18
2021	1,066
2020	1,556
2019	1,793
2018	1,754
2017	1,548