Topic

Pipeline (computing)

About: Pipeline (computing) is a research topic. Over its lifetime, 26,760 publications have been published within this topic, receiving 204,305 citations. The topic is also known as data pipeline and computational pipeline.

Papers

Journal Article•DOI•

Timing constraints for wave-pipelined systems

C.T. Gray¹, Wentai Liu², Ralph K. Cavin²•Institutions (2)

Research Triangle Park¹, North Carolina State University²

01 Aug 1994-IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

TL;DR: Presents a timing constraint formulation for the correct clocking of wave-pipelined systems, and discusses the implications of and motivations for using accurate delay models and exact timing analysis when determining combinational logic delays.

Abstract: Wave-pipelining is a timing methodology used in digital systems to achieve maximal rate operation. Using this technique, new data are applied to the inputs of a combinational block before the previous outputs are available, thus effectively pipelining the combinational logic and maximizing the utilization of the logic without inserting registers. This paper presents a timing constraint formulation for the correct clocking of wave-pipelined systems. Both single- and multiple-stage systems including feedback are considered. Based on the formulation of this paper, several important new results are presented relating to performance limits of wave-pipelined circuits. These results include the specification of distinct and disjoint regions of valid operation dependent on the clock period, intentional clock skew, and the global clock latency. Also, implications and motivations for the use of accurate delay models and exact timing analysis in the determination of combinational logic delays are given, and an analogous relationship between the multi-stage system and the single-stage system in terms of performance limits is shown. The minimum clock period is obtained by clock skew optimization formulated as a linear program. In addition, important special cases are examined and their relative performance limits are analyzed.
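
As a rough illustration of why wave-pipelining can shorten the clock period, the sketch below uses a commonly cited simplified single-stage bound, in which the period is limited by the spread between the longest and shortest combinational delays rather than by the longest delay alone. This is a textbook-style model under stated assumptions, not the formulation of the paper above (which also covers feedback, multiple stages, and intentional clock skew). All function and parameter names are hypothetical, and the conventional bound ignores the register clock-to-output delay.

```python
import math


def min_wave_period(d_max, d_min, t_setup, t_hold, clk_uncertainty=0.0):
    """Simplified lower bound on the clock period of one wave-pipelined stage.

    Assumption: successive waves must not collide, so the period has to cover
    the delay spread plus the register sampling window and clock uncertainty
    at both the launching and the capturing edge.
    """
    if d_min > d_max:
        raise ValueError("d_min must not exceed d_max")
    return (d_max - d_min) + t_setup + t_hold + 2.0 * clk_uncertainty


def min_conventional_period(d_max, t_setup, clk_uncertainty=0.0):
    """Simplified bound for the same logic with only one wave in flight."""
    return d_max + t_setup + clk_uncertainty


def waves_in_flight(d_max, period):
    """Approximate number of data waves inside the logic at the same time."""
    return math.ceil(d_max / period)


if __name__ == "__main__":
    # Illustrative numbers in arbitrary time units, not taken from the paper.
    wave = min_wave_period(10.0, 8.0, 0.5, 0.5, 0.25)
    conventional = min_conventional_period(10.0, 0.5, 0.25)
    print(wave, conventional, waves_in_flight(10.0, wave))
```

With these illustrative numbers the simplified wave-pipelined bound is 3.5 time units against 10.75 for the conventional case, which corresponds to three waves in flight. The gain depends entirely on how closely the shortest path delay can be balanced against the longest, which is one reason accurate delay models matter.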

61 citations

Book•

Self-Timed Control of Concurrent Processes: The Design of Aperiodic Logical Circuits in Computers and Discrete Systems

M. A. Kishinevskii, Alexandre V. Yakovlev, Victor I. Varshavsky

01 Jan 1990

TL;DR: Asynchronous processes and their interpretation, as well as the generalization of the Muller theorem, and the modelling of Petri nets, are presented.

Abstract: Table of contents.

1 Introduction

2 Asynchronous processes and their interpretation: 2.1 Asynchronous processes; 2.1.1 Definition; 2.1.2 Some subclasses; 2.1.3 Reposition; 2.1.4 Structured situations; 2.1.5 An asynchronous process as a metamodel; 2.2 Petri nets; 2.2.1 Model description; 2.2.2 Some classes; 2.2.3 Interpretation; 2.3 Signal graphs; 2.4 The Muller model; 2.5 Parallel asynchronous flow charts; 2.6 Asynchronous state machines; 2.7 Reference notations

3 Self-synchronizing codes: 3.1 Preliminary definitions; 3.2 Direct-transition codes; 3.3 Two-phase codes; 3.4 Double-rail code; 3.5 Code with identifier; 3.6 Optimally balanced code; 3.7 On the code redundancy; 3.8 Differential encoding; 3.9 Reference notations

4 Aperiodic circuits: 4.1 Two-phase implementation of finite state machine; 4.1.1 Matched implementation; 4.2 Completion indicators and checkers; 4.3 Synthesis of combinatorial circuits; 4.3.1 Indicatability; 4.3.2 Standard implementations; 4.3.2.1 Minimum form implementation; 4.3.2.2 Orthogonal form implementation; 4.3.2.3 Hysteresis flip-flop-based implementation; 4.3.2.4 Implementation based on "collective responsibility"; 4.4 Aperiodic flip-flops; 4.4.1 Further discussion of flip-flop designs; 4.4.1.1 RS-flip-flops; 4.4.1.2 D-flip-flops; 4.4.1.3 T-flip-flops; 4.5 Canonical aperiodic implementations of finite state machines; 4.5.1 Implementation with delay flip-flops; 4.5.2 Implementation using flip-flops with separated inputs; 4.5.3 Implementation with complementing flip-flops; 4.6 Implementation with multiple phase signals; 4.7 Implementation with direct transitions; 4.8 On the definition of an aperiodic state machine; 4.9 Reference notations

5 Circuit modelling of control flow: 5.1 The modelling of Petri nets; 5.1.1 Event-based modelling; 5.1.2 Condition-based modelling; 5.2 The modelling of parallel asynchronous flow charts; 5.2.1 Implementation of standard fragments; 5.2.2 A multiple use circuit; 5.2.3 A loop control circuit; 5.2.4 Using an arbiter; 5.2.5 Guard-based implementation; 5.3 Functional completeness and synthesis of semi-modular circuits; 5.3.1 Formulation of the problem; 5.3.2 Some properties of semi-modular circuits; 5.3.3 Perfect implementation; 5.3.4 Simple circuits; 5.3.5 The implementation of distributive and totally sequential circuits; 5.4 Synthesis of semi-modular circuits in limited bases; 5.5 Modelling pipeline processes; 5.5.1 Properties of modelling pipeline circuits; 5.5.1.1 Pipelinization of parallel fragments; 5.5.1.2 Pipelinization of a conditional branch; 5.5.1.3 Transformation of a loop; 5.5.1.4 Pipelinization for multiply-used sections; 5.6 Reference notations

6 Composition of asynchronous processes and circuits: 6.1 Composition of asynchronous processes; 6.1.1 Reinstated process; 6.1.2 Process reduction; 6.1.3 Process composition; 6.2 Composition of aperiodic circuits; 6.2.1 The Muller theorem; 6.2.2 The generalization of the Muller theorem; 6.3 Algebra of asynchronous circuits; 6.3.1 Operations on circuits; 6.3.2 Laws and properties; 6.3.3 Circuit transformations; 6.3.4 Homological algebras of circuits; 6.4 Reference notations

7 The matching of asynchronous processes and interface organization: 7.1 Matched asynchronous processes; 7.2 Protocol; 7.3 The matching asynchronous process; 7.4 The T2 interface; 7.4.1 General notations; 7.4.2 Communication protocol; 7.4.3 Implementation; 7.5 Asynchronous interface organization; 7.5.1 Using the code with identifier; 7.5.2 Using the optimally-balanced code; 7.5.2.1 Half-byte data transfer; 7.5.2.2 Byte data transfer; 7.5.2.3 Using non-balanced representation; 7.6 Reference notations

8 Analysis of asynchronous circuits and processes: 8.1 The reachability analysis; 8.2 The classification analysis; 8.3 The set of operational states; 8.4 The effect of non-zero wire delays; 8.5 Circuit Petri nets; 8.6 On the complexity of analysis algorithms; 8.7 Reference notations

9 Anomalous behaviour of logical circuits and the arbitration problem: 9.1 Arbiters; 9.2 Oscillatory anomaly; 9.3 Meta-stability anomaly; 9.4 Designing correctly-operating arbiters; 9.5 "Bounded" arbiters and safe inertial delays; 9.6 Reference notations

10 Fault diagnosis and self-repair in aperiodic circuits: 10.1 Totally self-checking combinational circuits; 10.2 Totally self-checking sequential machines; 10.3 Fault detection in autonomous circuits; 10.4 Self-repair organization for aperiodic circuits; 10.5 Reference notations

11 Typical examples of aperiodic design modules: 11.1 The JK-flip-flop; 11.2 Registers; 11.3 Pipeline registers; 11.3.1 Non-dense registers; 11.3.2 Semi-dense pipeline register; 11.3.3 Dense pipeline registers; 11.3.4 One-byte dense pipeline register; 11.3.5 Pipeline register with parallel read-write and the stack; 11.3.6 Reversive pipeline registers; 11.4 Converting single-rail signals into double-rail ones; 11.4.1 Parallel register with single-rail inputs; 11.4.2 Input and output heads of pipeline registers; 11.5 Counters; 11.6 Reference notations

Editor's Epilogue

References

61 citations

Posted Content•

DAPPLE: A Pipelined Data Parallel Approach for Training Large Models

Shiqing Fan¹, Yi Rong¹, Chen Meng¹, Zongyan Cao¹, Siyu Wang¹, Zhen Zheng¹, Chuan Wu², Guoping Long¹, Jun Yang¹, Lixue Xia¹, Lansong Diao¹, Xiaoyong Liu¹, Wei Lin¹•Institutions (2)

Alibaba Group¹, University of Hong Kong²

02 Jul 2020-arXiv: Distributed, Parallel, and Cluster Computing

TL;DR: Proposes DAPPLE, a synchronous training framework that combines data parallelism and pipeline parallelism for large DNN models. It features a novel parallelization strategy planner that solves the partition and placement problems and explores the optimal hybrid strategy of data and pipeline parallelism.

Abstract: It is a challenging task to train large DNN models on sophisticated GPU platforms with diversified interconnect capabilities. Recently, pipelined training has been proposed as an effective approach for improving device utilization. However, there are still several tricky issues to address: improving computing efficiency while ensuring convergence, and reducing memory usage without incurring additional computing costs. We propose DAPPLE, a synchronous training framework which combines data parallelism and pipeline parallelism for large DNN models. It features a novel parallelization strategy planner to solve the partition and placement problems, and explores the optimal hybrid strategy of data and pipeline parallelism. We also propose a new runtime scheduling algorithm to reduce device memory usage, which is orthogonal to the re-computation approach and does not come at the expense of training throughput. Experiments show that the DAPPLE planner consistently outperforms strategies generated by PipeDream's planner by up to 3.23x under synchronous training scenarios, and that the DAPPLE runtime outperforms GPipe with a 1.6x speedup in training throughput while reducing memory consumption by 12%.
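
To make the device-utilization concern concrete, the sketch below models an idealized synchronous pipeline in which every stage takes one time step per micro-batch. It is a generic GPipe-style illustration, not DAPPLE's planner or runtime scheduler; the function names and the equal-stage-time assumption are hypothetical simplifications introduced here.

```python
from fractions import Fraction


def forward_schedule(num_stages, num_micro_batches):
    """Return {time_step: [(stage, micro_batch), ...]} for an idealized pipeline.

    Assumption: each stage needs exactly one time step per micro-batch, and
    stage s can start micro-batch m only after stage s - 1 has finished it.
    """
    schedule = {}
    for stage in range(num_stages):
        for mb in range(num_micro_batches):
            # Micro-batch mb reaches stage `stage` at step stage + mb.
            schedule.setdefault(stage + mb, []).append((stage, mb))
    return schedule


def bubble_fraction(num_stages, num_micro_batches):
    """Fraction of (stage, time step) slots that stay idle in that schedule."""
    total_steps = num_micro_batches + num_stages - 1
    busy_slots = num_stages * num_micro_batches
    return 1 - Fraction(busy_slots, num_stages * total_steps)


if __name__ == "__main__":
    for micro_batches in (1, 4, 8, 32):
        print(micro_batches, bubble_fraction(4, micro_batches))
```

In this simplified model the idle fraction is (S - 1) / (M + S - 1) for S stages and M micro-batches, so more micro-batches improve utilization. In practice they also tend to increase the activation memory that must be held, which is the kind of trade-off a runtime scheduler has to manage.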

61 citations

Patent•

Asynchronous, high-bandwidth memory component using calibrated timing elements

Frederick A. Ware¹, Ely K. Tsern¹, Craig E. Hampel¹, Donald C. Stark¹•Institutions (1)

Rambus¹

30 Jan 2002

TL;DR: Describes an asynchronous memory device that uses internal delay elements to enable memory access pipelining. In one embodiment, the delay elements are responsive to an input load control signal and are calibrated with reference to periodically received timing pulses.

Abstract: Disclosed herein are embodiments of an asynchronous memory device that use internal delay elements to enable memory access pipelining. In one embodiment, the delay elements are responsive to an input load control signal, and are calibrated with reference to periodically received timing pulses. Different numbers of the delay elements are configured to produce different asynchronous delays and to strobe sequential pipeline elements of the memory device.

61 citations

Journal Article•DOI•

A tnGAN-Based Leak Detection Method for Pipeline Network Considering Incomplete Sensor Data

Xuguang Hu¹, Huaguang Zhang¹, Dazhong Ma¹, Rui Wang¹•Institutions (1)

Northeastern University (China)¹

01 Jan 2021-IEEE Transactions on Instrumentation and Measurement

TL;DR: Proposes a generative adversarial network based on a trinetworks form (tnGAN) to handle leak detection problems with incomplete sensor data. The method can handle different incomplete-data recovery situations, such as individual loss and random missing data.

Abstract: Due to the widely deployed sensors in the pipeline network, the data-driven detection method is a natural choice with multiple sensor measurements. However, the incomplete data problem caused by device failure or network interruption seriously hinders the implementation of pipeline status monitoring. To address this difficulty, this article proposes a generative adversarial network based on a trinetworks form (tnGAN) to handle leak detection problems with incomplete sensor data. First, the generative model is proposed to recover incomplete data by fully exploiting the same-level nature similarity of data features. Therein, the same type of sensor data, obtained from the pipeline network, is used as the input. Next, to further boost the temporal evolution characteristics and the spatial similarity, a multiview awareness strategy is incorporated in the established model to facilitate the integration of inherent information. Then, a dual-discriminative network architecture is proposed to detect pipeline status by computing the similarity of the latent features of samples. With the above-mentioned structure, the proposed method can handle different incomplete-data recovery situations, such as individual loss and random missing data. In addition, it can also aggregate the output and features of the discriminative networks to obtain the pipeline leak detection result. Finally, the experimental results on a pipeline network demonstrate the capability and effectiveness of the proposed method in both data recovery and leak detection.

61 citations

Network Information

Performance

Metrics

Papers: 26,760

Citations: 229,716

No. of papers in the topic in previous years
Year	Papers
2022	18
2021	1,066
2020	1,556
2019	1,793
2018	1,754
2017	1,548
