Topic

Pipeline (computing)

About: Pipeline (computing) is a research topic. Over its lifetime, 26,760 publications have been published within this topic, receiving 204,305 citations. The topic is also known as: data pipeline & computational pipeline.


Papers
Proceedings ArticleDOI
22 Aug 2005
TL;DR: This paper shows that, compared to previous schemes, SDP is the only scheme that scales well in all five scalability requirements; it achieves scalability in throughput by simultaneously pipelining at the data-structure level and the hardware level.
Abstract: A truly scalable IP-lookup scheme must address five challenges of scalability, namely: routing-table size, lookup throughput, implementation cost, power dissipation, and routing-table update cost. Though several IP-lookup schemes have been proposed in the past, none of them does well in all five scalability requirements. Previous schemes pipeline tries by mapping trie levels to pipeline stages. We make the fundamental observation that because this mapping is static and oblivious of the prefix distribution, the schemes do not scale well when worst-case prefix distributions are considered. This paper is the first to meet all five requirements in the worst case. We propose scalable dynamic pipelining (SDP), which includes three key innovations: (1) We map trie nodes to pipeline stages based on the node height. Because the node height is directly determined by the prefix distribution, the node height succinctly provides sufficient information about the distribution. Our mapping enables us to prove a worst-case per-stage memory bound which is significantly tighter than those of previous schemes. (2) We exploit our mapping to propose a novel scheme for incremental route updates. In our scheme a route update requires exactly one write dispatched into the pipeline. This route-update cost is the optimum, and our scheme achieves it in the worst case. (3) We achieve scalability in throughput by simultaneously pipelining at the data-structure level and the hardware level. SDP naturally scales in power and implementation cost. We not only present a theoretical analysis but also evaluate SDP and a number of previous schemes using detailed hardware simulation. Compared to previous schemes, we show that SDP is the only scheme that scales well in all five requirements.
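To make the node-height mapping concrete, here is a minimal, runnable sketch (not the paper's implementation): it builds a binary trie from a toy prefix table, computes each node's height, and assigns the node to pipeline stage = height. The prefix table, class, and helper names are illustrative assumptions.

```python
# Minimal sketch of height-based stage assignment, the core idea behind SDP.
# Everything here (prefixes, names, stage layout) is illustrative, not the paper's code.

class TrieNode:
    def __init__(self):
        self.children = {}    # bit ('0' or '1') -> TrieNode
        self.is_prefix = False

def insert(root, prefix_bits):
    """Insert a prefix given as a bit string, e.g. '1011'."""
    node = root
    for bit in prefix_bits:
        node = node.children.setdefault(bit, TrieNode())
    node.is_prefix = True

def height(node):
    """Height = longest path from this node down to a leaf (leaves have height 0)."""
    if not node.children:
        return 0
    return 1 + max(height(child) for child in node.children.values())

def map_to_stages(root):
    """Assign each trie node to pipeline stage = its height.

    Because height is determined by the prefix distribution itself, nodes of a
    long, skewed subtrie spread across many stages instead of piling into one.
    """
    stages = {}
    def visit(node, path):
        stages.setdefault(height(node), []).append(path or '(root)')
        for bit, child in node.children.items():
            visit(child, path + bit)
    visit(root, '')
    return stages

if __name__ == "__main__":
    root = TrieNode()
    for p in ["0", "10", "101", "1011", "11"]:   # toy prefix table
        insert(root, p)
    for stage, nodes in sorted(map_to_stages(root).items()):
        print(f"stage {stage}: {nodes}")
```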

56 citations

Proceedings ArticleDOI
25 Jun 2007
TL;DR: Division and square-root algorithms are also described that take advantage of high-precision linear-approximation hardware to obtain a reciprocal or reciprocal-square-root approximation.
Abstract: The floating point unit of the next generation PowerPC is detailed. It has been tested at over 5 GHz. The design supports an extremely aggressive cycle time of 13 FO4 using a technology-independent measure. For most dependent instructions, its fused multiply-add dataflow has only 6 effective pipeline stages. This is nearly equivalent to its predecessor, the POWER5, even though its technology-independent frequency has increased over 70%. Overall the frequency has improved over 100%. It achieves this high performance through aggressive feedback paths, circuit design and layout. The pipeline has 7 stages, but data may be fed back to dependent operations prior to rounding and complete normalization. Division and square root algorithms are also described which take advantage of high-precision linear approximation hardware for obtaining a reciprocal or reciprocal square root approximation.
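The division and square-root approach described above, a low-precision reciprocal (or reciprocal-square-root) seed refined by iteration, can be illustrated with a short sketch. This is a generic Newton-Raphson refinement under assumed seed precision and iteration counts, not the PowerPC unit's actual algorithm or hardware.

```python
# Illustrative sketch only: refine a coarse reciprocal / reciprocal-square-root
# seed with Newton-Raphson iterations to implement division and square root.
# Seed precision and iteration counts are assumptions, not the PowerPC design.

def recip_seed(d):
    """Low-precision seed, standing in for the linear-approximation hardware."""
    return round(1.0 / d, 2)

def divide(a, d, iterations=3):
    """a / d via reciprocal refinement: x <- x * (2 - d * x)."""
    x = recip_seed(d)
    for _ in range(iterations):
        x = x * (2.0 - d * x)      # error roughly squares each iteration
    return a * x

def rsqrt(d, iterations=3):
    """1 / sqrt(d) via x <- x * (3 - d * x^2) / 2."""
    x = round(d ** -0.5, 2)        # stand-in low-precision seed
    for _ in range(iterations):
        x = x * (3.0 - d * x * x) / 2.0
    return x

if __name__ == "__main__":
    print(divide(355.0, 113.0))    # ~3.14159...
    print(rsqrt(2.0) ** -2)        # ~2.0
```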

56 citations

Patent
15 Aug 1988
TL;DR: In this patent, a branch command initiates a delay interval that allows target instructions to be prefetched while the existing pipeline contents continue to execute, preventing a pipeline break and avoiding lost time; for a conditional branch, the jump (split) to the target instructions is performed only if the condition is met.
Abstract: In a pipeline computer, current instructions executed in sequence are monitored for conditional and unconditional branch commands, as well as the readiness of condition codes, the meeting of branch conditions and split commands. A branch command initiates an interval of delay which affords prefetching target instructions while using pipeline contents to prevent a pipeline break and avoid lost time. Detection of a branch command actuates a register to store a sequence of target instructions. Unless a branch command is conditional, subsequent detection (delayed) of a split command shifts the stored target instructions into operation as the current instructions. For a conditional branch command, a jump or split to the target instructions is performed only if the condition is met. Otherwise the current instruction sequence is restored pending another branch command. Dual instruction registers, program counters and address registers alternate to accommodate branch jumps with considerable time savings by effective programming.
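The control flow the abstract describes (keep executing the current stream during the delay interval, prefetch the target stream into a second buffer, and switch at the split point only when the branch is unconditional or its condition is met) can be sketched as a toy simulator. The instruction encoding and helper names below are invented for illustration; they are not the patent's circuitry.

```python
# Toy model of branch handling with a prefetched target buffer and a delayed
# "split" switch point. Instruction format is an assumption for the sketch.

def run(program, targets, condition_codes):
    """program/targets hold (op, arg) tuples; targets maps label -> target stream."""
    current, pc = list(program), 0
    prefetched, cond = None, None        # second instruction buffer + pending condition
    trace = []
    while pc < len(current):
        op, arg = current[pc]
        pc += 1
        if op == "BRANCH":               # unconditional: prefetch, always take at SPLIT
            prefetched, cond = list(targets[arg]), None
        elif op == "CBRANCH":            # conditional: arg = (target_label, condition_name)
            label, cond = arg
            prefetched = list(targets[label])
        elif op == "SPLIT":              # switch point after the delay interval
            if prefetched is not None and (cond is None or condition_codes[cond]):
                current, pc = prefetched, 0   # jump: target buffer becomes current stream
            prefetched, cond = None, None     # otherwise stay on the current stream
        else:                            # ordinary instruction (fills the delay interval)
            trace.append(arg)
    return trace

if __name__ == "__main__":
    prog = [("EXEC", "i1"), ("CBRANCH", ("L", "Z")), ("EXEC", "delay1"),
            ("EXEC", "delay2"), ("SPLIT", None), ("EXEC", "fallthru")]
    targets = {"L": [("EXEC", "t1"), ("EXEC", "t2")]}
    print(run(prog, targets, {"Z": True}))   # ['i1', 'delay1', 'delay2', 't1', 't2']
    print(run(prog, targets, {"Z": False}))  # ['i1', 'delay1', 'delay2', 'fallthru']
```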

56 citations

Proceedings ArticleDOI
12 Feb 2005
TL;DR: This paper examines the realistic benefits and limits of clock-gating in current-generation high-performance processors, identifies additional opportunities to avoid unnecessary clocking in real workload executions, and evaluates the power-reduction benefits of two newly invented schemes, transparent pipeline clock-gating and elastic pipeline clock-gating.
Abstract: Clock-gating has been introduced as the primary means of dynamic power management in recent high-end commercial microprocessors. The temperature drop resulting from active power reduction can result in additional leakage power savings in future processors. In this paper we first examine the realistic benefits and limits of clock-gating in current generation high-performance processors (e.g. of the POWER4™ or POWER5™ class). We then look beyond classical clock-gating: we examine additional opportunities to avoid unnecessary clocking in real workload executions. In particular, we examine the power reduction benefits of a couple of newly invented schemes called transparent pipeline clock-gating and elastic pipeline clock-gating. Based on our experiences with current designs, we try to bound the practical limits of clock gating efficiency in future microprocessors.
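As a rough intuition for why gating idle pipeline stages saves power, here is a back-of-the-envelope simulation of classical per-stage clock-gating, the baseline the paper starts from; it does not model the transparent or elastic schemes the paper proposes. The stage count, issue rate, and the assumption that gated clock pulses translate directly into saved clock power are simplifications for illustration.

```python
# Toy model: a stage's clock is suppressed in cycles when it holds no valid work,
# and the fraction of suppressed clock pulses approximates the clock power saved.

import random

def simulate(num_stages=7, cycles=10_000, issue_prob=0.4, seed=1):
    random.seed(seed)
    valid = [False] * num_stages            # does each stage hold a valid op this cycle?
    total_clocks = gated_clocks = 0
    for _ in range(cycles):
        for stage_valid in valid:           # clock (or gate) every stage this cycle
            total_clocks += 1
            if not stage_valid:
                gated_clocks += 1           # no valid data -> clock pulse gated off
        # advance the pipeline: shift valid bits down, maybe issue a new op
        valid = [random.random() < issue_prob] + valid[:-1]
    return gated_clocks / total_clocks

if __name__ == "__main__":
    print(f"fraction of clock pulses gated: {simulate():.2%}")  # ~60% at a 40% issue rate
```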

56 citations

Patent
15 Apr 2010
TL;DR: In this patent, a multi-mode Advanced Encryption Standard (MM-AES) module for a storage controller is adapted to perform interleaved processing of multiple data streams, i.e., to concurrently encrypt and/or decrypt string-data blocks from multiple data streams using, for each data stream, a corresponding cipher mode that is any one of a plurality of AES cipher modes.
Abstract: In one embodiment, a multi-mode Advanced Encryption Standard (MM-AES) module for a storage controller is adapted to perform interleaved processing of multiple data streams, i.e., concurrently encrypt and/or decrypt string-data blocks from multiple data streams using, for each data stream, a corresponding cipher mode that is any one of a plurality of AES cipher modes. The MM-AES module receives a string-data block with (a) a corresponding key identifier that identifies the corresponding module-cached key and (b) a corresponding control command that indicates to the MM-AES module what AES-mode-related processing steps to perform on the data block. The MM-AES module generates, updates, and caches masks to preserve inter-block information and allow the interleaved processing. The MM-AES module uses an unrolled and pipelined architecture where each processed data block moves through its processing pipeline in step with correspondingly moving key, auxiliary data, and instructions in parallel pipelines.
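The interleaving idea (round-robin processing of blocks from several streams, each carrying its own cached key, cipher mode, and chaining mask so that inter-block state survives) can be sketched in a few lines. The block cipher below is a trivial XOR placeholder standing in for the AES core, and the mode handling covers only ECB and CBC; the class names, stream setup, and mode behavior are assumptions for illustration, not the patent's design.

```python
# Conceptual sketch of interleaved multi-stream processing with per-stream state.
# The "cipher" is a XOR placeholder, NOT real AES; names are illustrative.

from itertools import zip_longest

BLOCK = 16  # bytes

def toy_block_cipher(block: bytes, key: bytes) -> bytes:
    """Placeholder for the AES datapath."""
    return bytes(b ^ k for b, k in zip(block, key))

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

class StreamCtx:
    """Per-stream state the module keeps cached: key, mode, chaining mask."""
    def __init__(self, key, mode, iv=bytes(BLOCK)):
        self.key, self.mode, self.mask = key, mode, iv

def encrypt_block(ctx: StreamCtx, plain: bytes) -> bytes:
    if ctx.mode == "ECB":
        return toy_block_cipher(plain, ctx.key)
    if ctx.mode == "CBC":
        cipher = toy_block_cipher(xor(plain, ctx.mask), ctx.key)
        ctx.mask = cipher                  # update cached mask for the next block
        return cipher
    raise ValueError("unsupported mode in this sketch")

def interleave(streams):
    """streams: list of (StreamCtx, [plaintext blocks]); process them round-robin."""
    out = [[] for _ in streams]
    for blocks in zip_longest(*[s[1] for s in streams]):
        for i, blk in enumerate(blocks):
            if blk is not None:
                out[i].append(encrypt_block(streams[i][0], blk))
    return out

if __name__ == "__main__":
    s0 = (StreamCtx(bytes(range(16)), "CBC"), [b"A" * 16, b"B" * 16])
    s1 = (StreamCtx(bytes(16), "ECB"), [b"C" * 16])
    print([[c.hex() for c in stream] for stream in interleave([s0, s1])])
```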

56 citations


Network Information
Related Topics (5)
Cache: 59.1K papers, 976.6K citations (86% related)
Scalability: 50.9K papers, 931.6K citations (85% related)
Server: 79.5K papers, 1.4M citations (82% related)
Electronic circuit: 114.2K papers, 971.5K citations (82% related)
CMOS: 81.3K papers, 1.1M citations (81% related)
Performance Metrics
No. of papers in the topic in previous years
Year    Papers
2022    18
2021    1,066
2020    1,556
2019    1,793
2018    1,754
2017    1,548