Showing papers on "Pipeline (computing)" published in 1990


Journal ArticleDOI
03 May 1990-Nature
TL;DR: In this paper, a processor built around a 'pipeline' architecture is used to simulate many-body systems with long-range forces; the architecture can be readily parallelized, making teraflop machines a feasible possibility.
Abstract: A processor has been constructed using a 'pipeline' architecture to simulate many-body systems with long-range forces. It has a speed equivalent to 120 megaflops, and the architecture can be readily parallelized to make teraflop machines a feasible possibility. The machine can be adapted to study molecular dynamics, plasma dynamics and astrophysical hydrodynamics with only minor modifications.
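To make the pipeline idea concrete, here is a minimal Python stand-in for the kind of hardwired force pipeline the abstract describes: one softened pairwise interaction streams through per "cycle" and is accumulated for the target particle. The softening parameter eps and all names are illustrative, not the machine's actual design.

```python
import numpy as np

def pairwise_forces(pos, mass, eps=1e-3):
    """Accumulate softened gravitational accelerations, one pairwise
    interaction per 'pipeline cycle', as a hardwired force pipeline
    would stream them. eps and all names are illustrative."""
    acc = np.zeros_like(pos)
    n = len(mass)
    for i in range(n):            # target particle held in the pipeline
        for j in range(n):        # one interaction streamed per cycle
            if i == j:
                continue
            dr = pos[j] - pos[i]
            r2 = dr @ dr + eps**2             # softened |r|^2
            acc[i] += mass[j] * dr / r2**1.5
    return acc

rng = np.random.default_rng(0)
print(pairwise_forces(rng.random((4, 3)), np.ones(4)))
```

Because each target particle's accumulation is independent, many such pipelines can run side by side, which is the parallelization the abstract points to.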

360 citations


Patent
31 Jan 1990
TL;DR: When the microcode execution unit determines that a new operation is required, an entry is inserted into the result queue; the entry includes all the information needed by the retire unit to retire the result once it is available from the respective functional unit.
Abstract: To increase the performance of a pipelined processor executing various classes of instructions, the classes of instructions are executed by respective functional units (164-167) which are independently controlled and operated in parallel. The classes of instructions include integer instructions (164), floating point instructions (165), multiply instructions (166), and divide instructions (167). The integer unit, which also performs shift operations, is controlled by the microcode execution unit (26) to handle the wide variety of integer and shift operations included in a complex, variable-length instruction set. The other functional units need only accept a control command to initiate the operation to be performed by the functional unit. The retiring of the results of the instructions need not be controlled by the microcode execution unit, but instead is delegated to a separate retire unit (173) that services a result queue (172). When the microcode execution unit determines that a new operation is required, an entry is inserted into the result queue. The entry includes all the information needed by the retire unit to retire the result once the result is available from the respective functional unit. The retire unit services the result queue by reading a tag in the entry at the head of the queue to determine the functional unit that is to provide the result. Once the result is available and the destination specified by the entry is also available, the result is retired in accordance with the entry, and the entry is removed from the queue.
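The result-queue discipline lends itself to a short sketch. The following Python is an illustrative model, not the patent's implementation: the entry fields, unit names, and destination-ready test are assumptions.

```python
from collections import deque

result_queue = deque()
results = {}           # results posted by functional units, keyed by tag

def issue(tag, unit, dest):
    """Microcode execution unit: insert an entry when an op is started."""
    result_queue.append({"tag": tag, "unit": unit, "dest": dest})

def post_result(tag, value):
    """A functional unit signals that its result is available."""
    results[tag] = value

def retire(register_file, dest_free):
    """Retire unit: service only the head of the queue, in order."""
    while result_queue:
        head = result_queue[0]
        if head["tag"] not in results or not dest_free(head["dest"]):
            break                               # head not ready yet
        register_file[head["dest"]] = results.pop(head["tag"])
        result_queue.popleft()

regs = {}
issue(1, "integer", "r3"); issue(2, "multiply", "r4")
post_result(2, 42); post_result(1, 7)           # results arrive out of order
retire(regs, dest_free=lambda d: True)
print(regs)                                     # {'r3': 7, 'r4': 42}
```

Note how the queue restores program order even though the multiply finished first; that is the point of delegating retirement to a separate unit.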

191 citations


Journal ArticleDOI
G. F. Grohoski1
TL;DR: The IBM RISC System/6000 processor is a second-generation RISC processor which reduces the execution pipeline penalties caused by branch instructions and also provides high floating-point performance.
Abstract: The IBM RISC System/6000 processor is a second-generation RISC processor which reduces the execution pipeline penalties caused by branch instructions and also provides high floating-point performance. It employs multiple functional units which operate concurrently to maximize the instruction execution rate. By employing these advanced machine-organization techniques, it can execute up to four instructions simultaneously. Approximately 11 MFLOPS are achieved on the LINPACK benchmarks.

149 citations


Patent
29 Jun 1990
TL;DR: In this paper, a register map is described having a free list of available physical locations in a register file, a log containing a sequential listing of logical registers changed during a predetermined number of cycles, a back-up map associating the logical registers with corresponding physical homes at a back-up point in a computer pipeline operation, and a predicted map associating the logical registers with corresponding physical homes at a current point in the computer pipeline operation.
Abstract: A register map having a free list of available physical locations in a register file, a log containing a sequential listing of logical registers changed during a predetermined number of cycles, a back-up map associating the logical registers with corresponding physical homes at a back-up point in a computer pipeline operation, and a predicted map associating the logical registers with corresponding physical homes at a current point in the computer pipeline operation. A set of valid bits is associated with the maps to indicate whether a particular logical register's corresponding physical home is to be taken from the back-up map or the predicted map. The valid bits can be "flash cleared" in a single cycle to back up the computer pipeline to the back-up point during a trap event.
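A compact sketch of the two-map idea follows; the commit path that promotes predicted entries into the back-up map is omitted, and all names are assumptions of ours.

```python
class RegisterMap:
    """Illustrative model of the free list / log / two-map scheme."""
    def __init__(self, n_logical, n_physical):
        self.free_list = list(range(n_logical, n_physical))
        self.backup = {r: r for r in range(n_logical)}    # physical homes
        self.predicted = dict(self.backup)
        self.valid = [False] * n_logical    # True: use the predicted map
        self.log = []                       # logical regs changed since back-up

    def lookup(self, logical):
        m = self.predicted if self.valid[logical] else self.backup
        return m[logical]

    def rename(self, logical):
        """Give a speculative write a fresh physical home."""
        self.predicted[logical] = self.free_list.pop(0)
        self.valid[logical] = True
        self.log.append(logical)

    def flash_clear(self):
        """Trap: one-step return to the back-up point."""
        for logical in self.log:
            self.free_list.append(self.predicted[logical])
        self.valid = [False] * len(self.valid)
        self.log.clear()

m = RegisterMap(4, 8)
m.rename(2)
print(m.lookup(2))    # 4: speculative physical home from the free list
m.flash_clear()
print(m.lookup(2))    # 2: the back-up mapping, recovered in one step
```

The stale entry still sits in the predicted map after the clear; only the valid bits decide which map answers, which is what makes the back-up effectively instantaneous.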

82 citations


Journal ArticleDOI
01 May 1990
TL;DR: This paper describes the architecture for issuing multiple instructions per clock in the NonStop Cyclone Processor, which includes cache support for unaligned double-precision accesses, a virtually-addressed main memory, and a novel precise exception mechanism.
Abstract: This paper describes the architecture for issuing multiple instructions per clock in the NonStop Cyclone Processor. Pairs of instructions are fetched and decoded by a dual two-stage prefetch pipeline and passed to a dual six-stage pipeline for execution. Dynamic branch prediction is used to reduce branch penalties. A unique microcode routine for each pair is stored in the large duplexed control store. The microcode controls parallel data paths optimized for executing the most frequent instruction pairs. Other features of the architecture include cache support for unaligned double-precision accesses, a virtually-addressed main memory, and a novel precise exception mechanism.

69 citations


Journal ArticleDOI
TL;DR: In each of these cases, it is shown that the asymptotic processor utilization is independent of the length of the pipeline; thus, linear speedup is achieved.
Abstract: We explore the practical limits on throughput imposed by timing in a long, self-timed, circulating pipeline (ring). We consider models with both fixed and random delays and derive exact results for pipelines where these delays are fixed or exponentially distributed random variables. We also give relationships that provide upper and lower bounds on throughput for any pipeline where the delays are independent random variables. In each of these cases, we show that the asymptotic processor utilization is independent of the length of the pipeline; thus, linear speedup is achieved. We present conditions under which this utilization approaches 100%.
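The length-independence result is easy to probe numerically. Below is a small discrete-event sketch under assumptions of our own (tokens initially in every other stage, unit-mean exponential delays, illustrative names) of such a self-timed ring; the printed utilization levels off as the ring grows rather than degrading.

```python
import heapq, random

EMPTY, FULL, BUSY = 0, 1, 2

def ring_utilization(n_stages, n_events=200_000, seed=1):
    """Self-timed ring: a stage fires when it holds a token and its
    successor is empty, taking an Exp(1)-distributed delay."""
    random.seed(seed)
    state = [FULL if i % 2 == 0 else EMPTY for i in range(n_stages)]
    events = []                                  # (completion_time, stage)

    def try_fire(i, now):
        if state[i] == FULL and state[(i + 1) % n_stages] == EMPTY:
            state[i] = BUSY
            heapq.heappush(events, (now + random.expovariate(1.0), i))

    for i in range(n_stages):
        try_fire(i, 0.0)
    t = fires = 0
    for _ in range(n_events):
        t, i = heapq.heappop(events)
        state[i] = EMPTY                         # token leaves stage i...
        state[(i + 1) % n_stages] = FULL         # ...and lands in i+1
        fires += 1
        try_fire((i + 1) % n_stages, t)          # new holder may fire
        try_fire((i - 1) % n_stages, t)          # predecessor may enter i
    return fires / (t * n_stages)                # busy fraction (mean delay 1)

for n in (8, 32, 128, 512):
    print(n, round(ring_utilization(n), 3))      # roughly flat in n
```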

63 citations


Book
01 Jan 1990
TL;DR: Asynchronous processes and their interpretation, the modelling of Petri nets, and a generalization of the Muller theorem are presented.
Abstract: 1 Introduction- 2 Asynchronous processes and their interpretation- 2.1 Asynchronous processes- 2.1.1 Definition- 2.1.2 Some subclasses- 2.1.3 Reposition- 2.1.4 Structured situations- 2.1.5 An asynchronous process as a metamodel- 2.2 Petri nets- 2.2.1 Model description- 2.2.2 Some classes- 2.2.3 Interpretation- 2.3 Signal graphs- 2.4 The Muller model- 2.5 Parallel asynchronous flow charts- 2.6 Asynchronous state machines- 2.7 Reference notations- 3 Self-synchronizing codes- 3.1 Preliminary definitions- 3.2 Direct-transition codes- 3.3 Two-phase codes- 3.4 Double-rail code- 3.5 Code with identifier- 3.6 Optimally balanced code- 3.7 On the code redundancy- 3.8 Differential encoding- 3.9 Reference notations- 4 Aperiodic circuits- 4.1 Two-phase implementation of finite state machine- 4.1.1 Matched implementation- 4.2 Completion indicators and checkers- 4.3 Synthesis of combinatorial circuits- 4.3.1 Indicatability- 4.3.2 Standard implementations- 4.3.2.1 Minimum form implementation- 4.3.2.2 Orthogonal form implementation- 4.3.2.3 Hysteresis flip-flop-based implementation- 4.3.2.4 Implementation based on "collective responsibility"- 4.4 Aperiodic flip-flops- 4.4.1 Further discussion of flip-flop designs- 4.4.1.1 RS-flip-flops- 4.4.1.2 D-flip-flops- 4.4.1.3 T-flip-flops- 4.5 Canonical aperiodic implementations of finite state machines- 4.5.1 Implementation with delay flip-flops- 4.5.2 Implementation using flip-flops with separated inputs- 4.5.3 Implementation with complementing flip-flops- 4.6 Implementation with multiple phase signals- 4.7 Implementation with direct transitions- 4.8 On the definition of an aperiodic state machine- 4.9 Reference notations- 5 Circuit modelling of control flow- 5.1 The modelling of Petri nets- 5.1.1 Event-based modelling- 5.1.2 Condition-based modelling- 5.2 The modelling of parallel asynchronous flow charts- 5.2.1 Implementation of standard fragments- 5.2.2 A multiple use circuit- 5.2.3 A loop control circuit- 5.2.4 Using an arbiter- 5.2.5 Guard-based implementation- 5.3 Functional completeness and synthesis of semi-modular circuits- 5.3.1 Formulation of the problem- 5.3.2 Some properties of semi-modular circuits- 5.3.3 Perfect implementation- 5.3.4 Simple circuits- 5.3.5 The implementation of distributive and totally sequential circuits- 5.4 Synthesis of semi-modular circuits in limited bases- 5.5 Modelling pipeline processes- 5.5.1 Properties of modelling pipeline circuits- 5.5.1.1 Pipelinization of parallel fragments- 5.5.1.2 Pipelinization of a conditional branch- 5.5.1.3 Transformation of a loop- 5.5.1.4 Pipelinization for multiply-used sections- 5.6 Reference notations- 6 Composition of asynchronous processes and circuits- 6.1 Composition of asynchronous processes- 6.1.1 Reinstated process- 6.1.2 Process reduction- 6.1.3 Process composition- 6.2 Composition of aperiodic circuits- 6.2.1 The Muller theorem- 6.2.2 The generalization of the Muller theorem- 6.3 Algebra of asynchronous circuits- 6.3.1 Operations on circuits- 6.3.2 Laws and properties- 6.3.3 Circuit transformations- 6.3.4 Homological algebras of circuits- 6.4 Reference notations- 7 The matching of asynchronous processes and interface organization- 7.1 Matched asynchronous processes- 7.2 Protocol- 7.3 The matching asynchronous process- 7.4 The T2 interface- 7.4.1 General notations- 7.4.2 Communication protocol- 7.4.3 Implementation- 7.5 Asynchronous interface organization- 7.5.1 Using the code with identifier- 7.5.2 Using the optimally-balanced code- 7.5.2.1 Half-byte data transfer- 7.5.2.2 Byte data transfer- 7.5.2.3 Using non-balanced representation- 7.6 Reference notations- 8 Analysis of asynchronous circuits and processes- 8.1 The reachability analysis- 8.2 The classification analysis- 8.3 The set of operational states- 8.4 The effect of non-zero wire delays- 8.5 Circuit Petri nets- 8.6 On the complexity of analysis algorithms- 8.7 Reference notations- 9 Anomalous behaviour of logical circuits and the arbitration problem- 9.1 Arbiters- 9.2 Oscillatory anomaly- 9.3 Meta-stability anomaly- 9.4 Designing correctly-operating arbiters- 9.5 "Bounded" arbiters and safe inertial delays- 9.6 Reference notations- 10 Fault diagnosis and self-repair in aperiodic circuits- 10.1 Totally self-checking combinational circuits- 10.2 Totally self-checking sequential machines- 10.3 Fault detection in autonomous circuits- 10.4 Self-repair organization for aperiodic circuits- 10.5 Reference notations- 11 Typical examples of aperiodic design modules- 11.1 The JK-flip-flop- 11.2 Registers- 11.3 Pipeline registers- 11.3.1 Non-dense registers- 11.3.2 Semi-dense pipeline register- 11.3.3 Dense pipeline registers- 11.3.4 One-byte dense pipeline register- 11.3.5 Pipeline register with parallel read-write and the stack- 11.3.6 Reversive pipeline registers- 11.4 Converting single-rail signals into double-rail ones- 11.4.1 Parallel register with single-rail inputs- 11.4.2 Input and output heads of pipeline registers- 11.5 Counters- 11.6 Reference notations- Editor's Epilogue- References

61 citations


Patent
Nariko Suzuki1
28 Aug 1990
TL;DR: In this article, a microprocessor is described in which the instruction decoding operation is performed in a pipelined manner by a precoder unit and a main decoder unit, with a buffer for temporarily storing information from the precoder unit positioned between the precoder unit and the main decoder unit.
Abstract: A microprocessor in which the instruction decoding operation is performed by a precoder unit and a main decoder unit operating in a pipelined manner, with a buffer for temporarily storing information from the precoder unit positioned between the precoder unit and the main decoder unit. The microprocessor supports different instruction formats and operand addressing modes without lowering the instruction decoding speed.

60 citations


Patent
09 Apr 1990
TL;DR: In this paper, the authors describe a system whose pipeline consists of a multi-channel bi-directional video bus, a multi-channel bi-directional audio bus, and a digital interprocessor communications bus, where a software driver interconnects the multiple video and audio devices in different configurations.
Abstract: A system (10) has a pipeline (12) comprised of a multi-channel bi-directional video bus (14), a multi-channel bi-directional audio bus (16), and a digital interprocessor communications bus (18). The pipeline (12) is equipped with a number of ports (20) where media controller (microprocessor) printed circuit cards (22) can be connected, thus providing a convenient method for connecting media devices (24) to the pipeline (12). In this manner, a media device's video input and output can be optionally connected to any of the video pipes (26) of the video bus (14). Similarly, the media device (24) audio inputs and outputs can be optionally connected to any of the audio bus (16) pipes (26). The switching is accomplished through a pair of analog multiplexers (28) whose connection options have been commanded by a local microprocessor (30) resident on the media device microprocessor control board (22). The local microprocessor (30) receives instructions for the pipeline switch interconnections through the interprocessor serial communications bus (18). The pipeline (12) is constructed on a motherboard printed circuit board (32) that additionally contains a microprocessor (34) that serves as the local area network controller for the interprocessor communications. A software driver interconnects the multiple video and audio devices (24) in different configurations in response to user inputs to a host data processing system so that physical assignments of the device communications on the pipeline (12) are transparent to the user.

57 citations


Journal ArticleDOI
TL;DR: An analytic model is presented for pipelined data-parallel computation on multicomputers; it uses timed Petri nets to describe data pipelining operations, and its predicted results match closely with the measured performance on a 64-node NCUBE hypercube multicomputer.
Abstract: The basic concept of pipelined data-parallel algorithms is introduced by contrasting the algorithms with other styles of computation and by a simple example (a pipeline image distance transformation algorithm). Pipelined data-parallel algorithms are a class of algorithms which use pipelined operations and data level partitioning to achieve parallelism. Applications which involve data parallelism and recurrence relations are good candidates for this kind of algorithm. The computations are ideal for distributed-memory multicomputers. By controlling the granularity through data partitioning and overlapping the operations through pipelining, it is possible to achieve a balanced computation on multicomputers. An analytic model is presented for modeling pipelined data-parallel computation on multicomputers. The model uses timed Petri nets to describe data pipelining operations. As a case study, the model is applied to a pipelined matrix multiplication algorithm. Predicted results match closely with the measured performance on a 64-node NCUBE hypercube multicomputer.
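The pipelining-plus-partitioning idea can be mimicked in a few lines. The sketch below is not the paper's NCUBE mapping; the partitioning of the inner dimension over P stages and the "beat" loop are illustrative.

```python
import numpy as np

def pipelined_matmul(A, B, P=4):
    """Rows of A flow through P pipeline stages; stage s owns a slice of
    the inner dimension and adds its partial product for the row."""
    n, k = A.shape
    slices = np.array_split(np.arange(k), P)     # data-level partitioning
    C = np.zeros((n, B.shape[1]))
    for t in range(n + P - 1):                   # pipeline beats
        for s in range(P):                       # stages work concurrently
            r = t - s                            # row in stage s at beat t
            if 0 <= r < n:
                idx = slices[s]
                C[r] += A[r, idx] @ B[idx, :]    # stage s's contribution
    return C

A = np.arange(12.0).reshape(3, 4); B = np.ones((4, 2))
print(np.allclose(pipelined_matmul(A, B), A @ B))   # True
```

Once the pipeline is full (after P - 1 beats), all P stages do useful work every beat, which is the balance between granularity and overlap the abstract describes.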

52 citations


Proceedings ArticleDOI
12 Mar 1990
TL;DR: This paper presents a new algorithm for the generation of pipelined designs, developed for use in an interactive behavioural synthesis system, that allows the user to trade off interactive response time against solution quality.
Abstract: This paper presents a new algorithm for the generation of pipelined designs developed for use in an interactive behavioural synthesis system. The authors' technique uses a novel iterative optimisation algorithm that allows the user to trade off interactive response time against solution quality. Two examples are given to demonstrate the effectiveness of the approach. These provide comparisons with: (i) an expert designer; and (ii) recently published algorithms for pipeline synthesis.

Patent
13 Jul 1990
TL;DR: In this article, a processor pipeline control system and method provides a complete set of very simple and very fast pipeline control signals encompassing stalls and interrupts, where exceptional conditions are signaled within the processor by deasserting the LoadX signals required by that exception.
Abstract: A processor pipeline control system and method provides a complete set of very simple and very fast pipeline control signals encompassing stalls and interrupts. Each pipeline stage has associated with it a signal called "LoadX", where X is the pipeline stage name, e.g., LoadID. Instead of signalling exceptional conditions in terms of the event, e.g., "cache miss", exceptional conditions are signalled within the processor by deasserting the LoadX signals required by that exception. When the pipeline control for one pipestage is deasserted, in order to prevent previous instructions from entering the stalled pipestage, the detector of the exceptional condition must deassert all LoadX control signals for stages previous to X as well.
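The LoadX discipline reduces to a few lines of logic. This sketch uses classic stage names (IF/ID/EX/MEM/WB) as assumptions of ours; the patent does not name its stages.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]     # earliest to latest

def load_signals(exception_stage=None):
    """Deassert Load for the faulting stage and every earlier stage."""
    load = {s: True for s in STAGES}
    if exception_stage is not None:
        for s in STAGES[: STAGES.index(exception_stage) + 1]:
            load[s] = False
    return load

def step(latches, load, fetched):
    """One clock: a stage latches new input only when its Load is asserted;
    a stalled predecessor forwards a bubble, not its held instruction."""
    new = dict(latches)
    for prev, cur in zip(STAGES, STAGES[1:]):
        if load[cur]:
            new[cur] = latches[prev] if load[prev] else "bubble"
    if load[STAGES[0]]:
        new[STAGES[0]] = fetched
    return new

latches = {s: f"insn{i}" for i, s in enumerate(STAGES)}
# A cache miss detected in MEM deasserts LoadMEM, LoadEX, LoadID, LoadIF:
print(step(latches, load_signals("MEM"), "insn5"))
# IF..MEM hold their instructions; WB receives a bubble and drains.
```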

Patent
21 May 1990
TL;DR: In this paper, a mechanism is presented for handling exceptions in a processor system that issues a family of more than one instruction during a single clock, while utilizing the exception handling procedures developed for single instructions.
Abstract: A mechanism for handling exceptions in a processor system that issues a family of more than one instruction during a single clock, while utilizing the exception handling procedures developed for single instructions. The mechanism detects an exception associated with one of the instructions in the family, inhibits the data writes for the instructions in the family, flushes the pipeline, and reissues the instruction singly. The exception handling procedure for the single instruction may then be utilized.
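A sketch of the detect/inhibit/flush/reissue sequence follows; the instruction encoding, the trap handler, and the write-buffering are stand-ins of ours, not the patent's.

```python
def execute(insn, regs):
    """Stand-in datapath: one fake side effect, one fault case."""
    op, dest, src = insn
    if op == "div" and src == 0:
        raise ZeroDivisionError
    regs[dest] = regs.get(dest, 0) + 1

def handle_trap(insn, regs):
    print("single-instruction trap handler invoked at", insn)

def run(families, regs):
    for family in families:
        staged = dict(regs)              # buffer the family's writes
        try:
            for insn in family:
                execute(insn, staged)
        except ZeroDivisionError:
            # Inhibit the family's writes (drop 'staged'), flush the
            # pipeline, and reissue the instructions one at a time so
            # the ordinary single-instruction handler applies.
            for insn in family:
                try:
                    execute(insn, regs)
                except ZeroDivisionError:
                    handle_trap(insn, regs)
                    break
        else:
            regs.update(staged)          # whole family retires together

regs = {}
run([[("add", "r2", 1), ("div", "r1", 0)]], regs)
print(regs)    # {'r2': 1}: the reissued add completed before the trap
```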

Patent
Pierre Radochonski1
14 May 1990
TL;DR: In this article, a graphics data processing pipeline is interconnected by a common bus for conveying data and arbitration signals to and from each stage, and each data transmitting stage arbitrates for and acquires control of the bus when it has output data to transmit to an addressable storage location within a next stage.
Abstract: Stages of a graphics data processing pipeline are interconnected by a common bus for conveying data and arbitration signals to and from each stage. Each data transmitting stage arbitrates for and acquires control of the bus when it has output data to transmit to an addressable storage location within a next stage. Each pipeline stage other than a first stage generates a BUSY bit indicating whether it is processing data or awaiting new input data from its preceding stage. When one pipeline stage has output data to transmit to a next pipeline stage, the transmitting stage periodically polls the receiving stage by acquiring control of the bus and placing on the bus a particular address associated with the next stage. Whenever the next stage detects the presence of the particular address on the bus, it places its BUSY bit on the data lines of the bus. When the sending stage determines from the state of the BUSY bit that the next stage is ready to receive input data, the sending stage acquires control of the bus and sends the input data thereon to the next stage.
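The poll-then-transfer protocol fits in a short sketch; the class, the lock standing in for bus arbitration, and the retry bound are illustrative assumptions.

```python
import threading

class Stage:
    """A receiving pipeline stage with a BUSY bit and an input latch."""
    def __init__(self, name):
        self.name, self.busy, self.latch = name, False, None
    def poll(self):
        return self.busy          # driven onto the bus data lines
    def receive(self, data):
        self.latch, self.busy = data, True
    def finish(self):
        self.busy = False         # done processing; ready for new input

def send(bus, data, receiver, max_polls=8):
    """Sender: acquire the bus, poll the receiver's BUSY bit, and
    transfer only when the receiver reports ready."""
    for _ in range(max_polls):
        with bus:                         # arbitrate for the common bus
            if not receiver.poll():
                receiver.receive(data)
                return True
        # receiver busy: release the bus so other stages can use it
    return False

bus = threading.Lock()
nxt = Stage("raster")
print(send(bus, "batch-0", nxt))   # True: accepted on the first poll
print(send(bus, "batch-1", nxt))   # False: nxt stays busy until finish()
```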

Patent
30 Oct 1990
TL;DR: In this paper, a method for severing and recovering a submerged pipeline is described, where a deflated lift bag is positioned under the submerged pipeline and then inflated until a section of the pipeline has been raised off of the sea floor.
Abstract: A method for severing and recovering a submerged pipeline is disclosed. The severing and recovering operation may be performed with divers or with a remotely operated vehicle. The deflated lift bag is lowered to the submerged pipeline. The deflated lift bag is positioned under the submerged pipeline and then inflated until a section of the pipeline has been raised off of the sea floor. A cutoff saw is next lowered to the raised section of the pipeline. The cutoff saw is clamped to the pipeline prior to severing the pipeline. The cutoff saw is then removed from the severed pipeline. A recovery head is lowered to the raised end of the severed pipeline. The recovery head is aligned and placed in the raised end of the severed pipeline. The recovery head is activated to establish a gripping relationship with the pipeline. A recovery cable is lowered and connected to the recovery head. The recovery cable is retrieved to recover the recovery head and the pipeline to the water surface.

Proceedings ArticleDOI
30 Nov 1990
TL;DR: A technique for unrolling the loop before pipelining is presented as an improvement to software pipelining, as it can allow Lam's algorithm to achieve time optimality for these restricted loops.
Abstract: Software pipelining can significantly increase the execution rate of loops. Each of the four major software pipelining algorithms takes a different approach to software pipelining. This paper discusses each method and explores some of the similarities and differences among the methods. On loops consisting of a single basic block, the Perfect Pipelining Algorithm [1] is the only software pipelining algorithm that currently achieves time optimality, in the absence of resource constraints. A technique for unrolling the loop before pipelining is presented as an improvement to software pipelining, as it can allow Lam's algorithm [2] to achieve time optimality for these restricted loops. Unrolling has an advantage over Perfect Pipelining because it can reduce the code space required for the software pipeline.
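The prologue/kernel/epilogue shape that software pipelining produces is easy to show with a three-stage loop body (load, compute, store). This pure-Python stand-in only mimics the schedule; on the VLIW-style machines these algorithms target, the three kernel statements would issue in the same cycle.

```python
def load(x):           return x
def compute(a):        return a * a
def store(out, i, b):  out[i] = b

def software_pipelined(xs):
    n, out = len(xs), [None] * len(xs)
    if n < 2:                              # too short to pipeline
        for i, x in enumerate(xs):
            store(out, i, compute(load(x)))
        return out
    a = load(xs[0])                        # prologue: fill the pipeline
    b = compute(a); a = load(xs[1])
    for i in range(2, n):                  # kernel: steady state
        store(out, i - 2, b)               # stage 3 of iteration i-2
        b = compute(a)                     # stage 2 of iteration i-1
        a = load(xs[i])                    # stage 1 of iteration i
    store(out, n - 2, b)                   # epilogue: drain
    store(out, n - 1, compute(a))
    return out

print(software_pipelined([1, 2, 3, 4, 5]))   # [1, 4, 9, 16, 25]
```

Unrolling the loop before pipelining enlarges the kernel body, which is, roughly, what gives Lam's modulo-scheduling approach more independent operations to pack per cycle on these restricted loops.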

Patent
20 Jul 1990
TL;DR: In this article, a time interval data processing circuit uses a pipelined hardware data processor to perform the conversion of incoming time stamp data into time interval results, which can be further processed into a hardware accumulated histogram or can be compared against limits to determine if a trigger condition has occurred.
Abstract: A time interval data processing circuit uses a pipelined hardware data processor to perform the conversion of incoming time stamp data into time interval results. These results can be further processed into a hardware accumulated histogram or can be compared against limits to determine if a time interval trigger condition has occurred. In the first stage of the pipeline, the processing circuit subtracts the two time stamps from the current and the previous event to determine the time interval between events being measured. The second stage checks the measurement result against minimum and maximum limits and determines which bin the measurement belongs in. The limit testing determines if the measurement fits the histogram limits and also yields the data required to perform measurement triggering on time intervals. The third stage of the pipeline increments the appropriate histogram bin in RAM. The first and third stages of the pipeline are themselves pipelined in substages. To facilitate pipelining in storing the histogram results, the circuit uses dual port RAMs to achieve a fast data accumulation rate. When histogramming, the stored bin data must be incremented each time a new measurement occurs. The third pipeline stage read, increment, write operation is pipelined in substages by adding a latch in the data incrementing loop for the dual port RAM. The latch also provides a way of avoiding access conflicts when the same bin is incremented repeatedly.
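The three stages map directly onto a short loop. This sketch is illustrative (the bin width, limits, and trigger list are assumptions), with the stages marked where the hardware would cut the pipeline.

```python
def interval_histogram(stamps, t_min, t_max, n_bins):
    hist = [0] * n_bins
    width = (t_max - t_min) / n_bins
    prev, triggers = None, []
    for ts in stamps:                       # one time stamp per clock
        if prev is not None:
            dt = ts - prev                  # stage 1: subtract stamps
            if t_min <= dt < t_max:         # stage 2: limit test, pick bin
                hist[int((dt - t_min) // width)] += 1   # stage 3: R-M-W
            else:
                triggers.append(dt)         # limit test doubles as trigger
        prev = ts
    return hist, triggers

print(interval_histogram([0.0, 1.1, 2.0, 3.4, 3.5, 9.0], 0.0, 2.0, 4))
# ([1, 1, 2, 0], [5.5]): four in-range intervals binned, one trigger
```

In the hardware, stage 3's read-increment-write on the dual-port RAM is itself sub-pipelined with a latch so that repeated hits on the same bin avoid access conflicts.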

Journal ArticleDOI
01 Jan 1990
TL;DR: This paper presents pipeline architectures for the dynamic programming algorithms for the knapsack problems that enable us to achieve an optimal speedup using processor arrays, queues, and memory modules.
Abstract: Dynamic programming is one of the most powerful approaches to many combinatorial optimization problems. In this paper, we present pipeline architectures for the dynamic programming algorithms for the knapsack problems. They enable us to achieve an optimal speedup using processor arrays, queues, and memory modules. The processor arrays can be regarded as pipelines where the dynamic programming algorithms are implemented through pipelining.
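The recurrence being pipelined is f_j(c) = max(f_{j-1}(c), f_{j-1}(c - w_j) + p_j). Below is a sequential Python emulation of the stage-per-item pipeline; the queue and stage names are illustrative, whereas the paper's hardware streams capacities through a processor array.

```python
def knapsack_pipeline(weights, profits, capacity):
    prev_row = [0] * (capacity + 1)          # f_0(c) = 0 for all c
    for w, p in zip(weights, profits):       # one pipeline stage per item
        row = []
        for c in range(capacity + 1):        # capacities stream in order
            best = prev_row[c]               # stage output: skip item j
            if c >= w:
                best = max(best, prev_row[c - w] + p)   # or take item j
            row.append(best)
        prev_row = row                       # queue feeding the next stage
    return prev_row[capacity]

print(knapsack_pipeline([2, 3, 4], [3, 4, 5], 5))   # 7: items 1 and 2
```

Stage j only ever consumes values f_{j-1}(c') with c' <= c, so it can start as soon as stage j-1 begins emitting, which is where the optimal speedup comes from.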

Patent
16 Nov 1990
TL;DR: In this paper, an improved multi-channel direct memory access (DMA) controller for data processing systems provides adaptive pipelining and time overlapping of operations performed relative to communication channels, which increases the effective rate of transfer at the bus interface to system memory and thereby allows for the controller to be used for applications in which throughput requirements and bus access constraints could otherwise conflict.
Abstract: An improved multi-channel direct memory access (DMA) controller for data processing systems provides adaptive pipelining and time overlapping of operations performed relative to communication channels. Registers and resources used to pipeline communication data and control signals relative to plural channels are adaptively shared relative to a single channel when command chaining is required relative to that channel. In command chaining a plural word command, termed a Device Control Block (DCB), is fetched from an external system memory via a bus having severe time constraints relative to potential real time requirements of the channels. Pipelining and time overlapping of channel operations, relative to plural channels, increases the effective rate of transfer at the bus interface to the system memory, and thereby allows for the controller to be used for applications in which throughput requirements and bus access constraints could otherwise conflict.

Patent
06 Feb 1990
TL;DR: In this article, the instructions in a sequence may be divided into groups by using either taken-branch instructions or certain instructions which may change the contents of the general purpose registers as group delimiters.
Abstract: A digital computer includes a main and an auxiliary pipeline processor which are configured to concurrently execute contiguous groups of instructions taken from a single instruction sequence. The instructions in a sequence may be divided into groups by using either taken-branch instructions or certain instructions which may change the contents of the general purpose registers as group delimiters. Both methods of grouping the instructions use a branch history table to predict the sequence in which the instructions will be executed.

Patent
Ralph M. Begun1
11 Jun 1990
TL;DR: In this paper, a data processing system includes a microprocessor operable in a burst mode to read data from a memory, while the memory, its controller, and the bus operate in a pipelining mode; array logic connected between the microprocessor and the remaining elements converts from the burst mode to the pipeline mode.
Abstract: A data processing system includes a microprocessor operable in a burst mode to read data from a memory. The memory, its controller and bus are operable in a pipelining mode. Array logic is connected between the microprocessor and the remaining elements for converting the burst mode to the pipeline mode.

Journal ArticleDOI
TL;DR: In this paper, the authors developed a theory to compute the statistical properties for the response of a surface-mounted pipeline to seismic excitations, making the assumptions that the pipeline behaves like an infinitely long Euler-Bernoulli beam on evenly spaced supports, and that the seismic motion is statistically stationary in time but nonhomogeneous in space.
Abstract: This paper develops a theory to compute the statistical properties for the response of a surface-mounted pipeline to seismic excitations. Making the assumptions that the pipeline behaves like an infinitely long Euler-Bernoulli beam on evenly spaced supports, that the seismic motion is statistically stationary in time but nonhomogeneous in space, and that the seismic inputs are fed into the pipeline system through the supports, solutions are obtained for the spectral densities of the structural response. The theory developed is applicable to any spectral distributions.
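As a hedged sketch of the setup (our notation, not necessarily the paper's), the beam response w(x, t) obeys the Euler-Bernoulli equation and is driven through the support motions u_j(t); for stationary inputs the response spectrum follows from the support-to-point transfer functions:

```latex
% Euler-Bernoulli beam with mass m, damping c, flexural stiffness EI:
%   EI\,\frac{\partial^4 w}{\partial x^4} + c\,\frac{\partial w}{\partial t}
%     + m\,\frac{\partial^2 w}{\partial t^2} = 0,
% with the excitation entering via prescribed support motions u_j(t).
% With H_j(x, \omega) the transfer function from support j to point x,
\[
  S_w(x,\omega) \;=\; \sum_{j}\sum_{k}
      H_j(x,\omega)\,\overline{H_k(x,\omega)}\;S_{u_j u_k}(\omega),
\]
% where the cross-spectra S_{u_j u_k}(\omega) of the support motions
% carry the spatial nonhomogeneity of the seismic field.
```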

Book ChapterDOI
TL;DR: A dynamic programming based Gas Pipeline Optimizer (GPO) has been developed at Scientific Software-Intercomp for the HBJ gas transmission pipeline system in India to satisfy specified delivery flow rate and minimum delivery pressure requirements at the receiving terminals.
Abstract: A dynamic programming based Gas Pipeline Optimizer (GPO) has been developed at Scientific Software-Intercomp for the HBJ gas transmission pipeline system in India. Used as an operating and planning tool, the GPO will determine the discharge pressures at the compressor stations and the number of compressor trains to operate at each compressor station so that fuel consumption and start-up/shut-down costs for the entire HBJ system are minimized under steady state conditions. Further, the optimization will satisfy specified delivery flow rate and minimum delivery pressure requirements at the receiving terminals and ensure that minimum line pack (inventory) requirements are met for each section of the pipeline. Excessive starting and stopping of compressor trains will also be avoided.

Patent
15 Jun 1990
TL;DR: In this paper, a system and technique for providing early decoding of complex instructions in a pipelined processor uses a programmed logic array to decode instruction segments and loads both the instruction bits and the associated predecoded bits into a FIFO buffer to accumulate a plurality of such entries.
Abstract: A system and technique for providing early decoding of complex instructions in a pipelined processor uses a programmed logic array to decode instruction segments and loads both the instruction bits and the associated predecoded bits into a FIFO buffer to accumulate a plurality of such entries. Meanwhile, an operand execute pipeline retrieves such entries from the FIFO buffer as needed, using the predecoded instruction bits to rapidly decode and execute the instructions at rates determined by the instructions themselves. Delays due to cache misses are substantially or entirely masked, as the instructions and associated predecoded bits are loaded into the FIFO buffer more rapidly than they are retrieved from it, except during cache misses. A method is described for increasing the effective speed of executing a three operand construct. Another method is disclosed for increasing the effective speed of executing a loop containing a branch instruction by scanning the predecoded bits in establishing a link between successive instructions.
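A toy model of the decoupling follows; the PLA's output bits and the instruction encoding here are invented for illustration.

```python
from collections import deque

fifo = deque()                 # holds (instruction, predecoded bits) pairs

def predecode(insn):
    """Stand-in for the programmed logic array: classify once, up front."""
    opcode = insn >> 12
    return {"is_branch": opcode == 0xB, "n_operands": insn & 0x3}

def fill(instructions):
    """Fetch side: runs ahead of execution except during cache misses."""
    for insn in instructions:
        fifo.append((insn, predecode(insn)))

def execute_pipeline():
    """Execute side: drains the FIFO at instruction-determined rates,
    using the predecoded bits instead of re-examining the raw bits."""
    while fifo:
        insn, bits = fifo.popleft()
        kind = "branch" if bits["is_branch"] else f"{bits['n_operands']} ops"
        print(hex(insn), kind)

fill([0xB001, 0x1002, 0x3003])
execute_pipeline()
```

As long as fill() stays ahead of execute_pipeline(), a cache miss on the fetch side only shrinks the FIFO's backlog instead of stalling execution, which is the masking effect the abstract claims.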

Journal ArticleDOI
TL;DR: A system design/scheduling strategy is described for a real-time parallel image processing system, based on a model of the spatial and temporal parallelisms extractable in image processing tasks; in the dynamic case, the strategy is modified to achieve the maximum possible processing speed.
Abstract: A system design/scheduling strategy is described for a real-time parallel image processing system. A parallel image processing model based on the spatial and temporal parallelisms extractable in image processing tasks is formulated. This model consists of linear pipeline stages, each of which is a multiprocessing module. The strategy is discussed for two cases: static and dynamic. The description of the detailed hardware structure of each module is not attempted because of its dependency on a specific task. In the static case, where the processing time is constant, the processing times of all the stages are adjusted to be identical. In the dynamic case, where the processing time varies with each image, the strategy needs to be modified to achieve the maximum possible processing speed. The strategy is demonstrated for the static case by implementing image processing tasks on two different multiprocessor systems.
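For the static case, the balancing argument is just arithmetic: pipeline throughput is the reciprocal of the slowest stage time, so slow stages are given proportionally more processors until all stage times match. The numbers below are illustrative.

```python
def throughput(stage_times):
    """Images per unit time for a linear pipeline."""
    return 1.0 / max(stage_times)

unbalanced = [4.0, 1.0, 2.0]        # time per image at each stage
print(throughput(unbalanced))        # 0.25: limited by the 4.0 stage

processors = [4, 1, 2]               # size each multiprocessing module...
balanced = [t / p for t, p in zip(unbalanced, processors)]
print(throughput(balanced))          # 1.0: ...so every stage takes 1.0
```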

Proceedings ArticleDOI
16 Jun 1990
TL;DR: The segmented pipeline architecture is illustrated with an application to image motion analysis; efficiency is maintained by switching processing elements between segments as image data flows through the system.
Abstract: A segmented pipeline architecture for multiresolution, focal, array processing is presented. A buffer is introduced at each point in a pipeline computation at which changes in sample density or analysis area may take place. These buffers divide the pipeline into segments, each with constant data load. When active, a segment runs at its full design rate. Efficiency is maintained by switching processing elements between segments as image data flows through the system. The segmented pipeline architecture is illustrated with an application to image motion analysis.

Patent
25 Jun 1990
TL;DR: In this paper, a flow meter is provided comprising a parallel combination of a solenoid valve and a pressure differential switch connected across the shut-off valve, where the pressure supply pump remains ON during the test procedure and the pressure on both sides of a closed shut-off valve in the pipeline is maintained equal.
Abstract: A method for detecting leaks in a fluid pipeline where the pressure supply pump remains ON during the test procedure and the pressure on both sides of a closed shut-off valve in the pipeline is maintained equal. A novel flow meter is provided comprising a parallel combination of a solenoid valve and a pressure differential switch connected across the shut-off valve.

Patent
29 May 1990
TL;DR: In this paper, a videophone system for providing videophone service within narrow band digital network, based on pipelined processing elements, includes source codec means coupled to the image bus, including numerous processing elements of which processing element has DSP (Digital Signal Processor) module, local memory module coupled to DSP module, FIFO (First Input First Output) memory module and image bus interface module, wherein communications between the processor elements is performed via the pipeline and common memory means coupled with the VME bus and the image buses, for storing message data for synchronization and communication
Abstract: A videophone system for providing videophone service within a narrow-band digital network, based on pipelined processing elements, includes source codec means coupled to the image bus and comprising numerous processing elements, each of which has a DSP (Digital Signal Processor) module, a local memory module coupled to the DSP module, a FIFO (First In, First Out) memory module coupled to the DSP module and the local memory module, and an image bus interface module coupled to the DSP module, the local memory module, and the image bus. Communication between the processing elements is performed via the pipeline, and common memory means, coupled to the VME bus and the image bus, stores message data for synchronization and communication between the processing elements.

Patent
10 Aug 1990
TL;DR: In this paper, a pipeline Fast Fourier Transform arrangement includes a cascade of four Butterfly Arithmetic Units (BAUs), with weighting and control signals coupled to each BAU in the cascade; a multiplexed memory arrangement operated in a "ping-pong" manner stores the four-stage partial FFT signal as it is generated and returns it to the first BAU for subsequent passes to generate multiple-stage FFT signals.
Abstract: A pipeline Fast Fourier Transform arrangement includes a cascade of four Butterfly Arithmetic Units (BAU). Weighting and control signals are coupled to each BAU in the cascade. A multiplexed memory arrangement operated in a "ping-pong" manner stores the four-stage partial FFT signal as it is generated, and returns it to the first BAU in the cascade for subsequent passes to generate multiple-stage FFT signals. Each BAU includes local memories cyclically fed and addressed to generate temporal offsets, which differ among the processors of the cascade.
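The butterfly-cascade structure can be emulated in software: each pass of the stage loop below plays the role of one BAU, and four such passes complete a 16-point transform. This is a generic radix-2 decimation-in-time sketch, not the patent's ping-pong memory addressing.

```python
import cmath

def butterfly(a, b, w):
    """Butterfly Arithmetic Unit: one weighted add/subtract pair."""
    t = w * b
    return a + t, a - t

def fft16(x):
    n = len(x)                      # n = 16: four radix-2 stages
    x = list(x)
    j = 0                           # bit-reversal feeds the first stage
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit; bit >>= 1
        j |= bit
        if i < j:
            x[i], x[j] = x[j], x[i]
    size = 2
    while size <= n:                # one pass per BAU in the cascade
        half = size // 2
        wstep = cmath.exp(-2j * cmath.pi / size)
        for start in range(0, n, size):
            w = 1.0 + 0j
            for k in range(half):
                i1, i2 = start + k, start + k + half
                x[i1], x[i2] = butterfly(x[i1], x[i2], w)
                w *= wstep
        size *= 2
    return x

X = fft16([complex(i) for i in range(16)])
print(round(abs(X[0]), 1))          # 120.0: the DC term, sum of 0..15
```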

Patent
Jacob K. White1
07 Dec 1990
TL;DR: In this article, a pipeline data processor is simultaneously operable in a pipeline mode, a parallel mode, and a vector mode, which is a special case of the pipeline mode; each pipeline stage has its own stage program counter.
Abstract: A pipeline data processor is simultaneously operable in a pipeline mode, a parallel mode and a vector mode which is a special case of the pipeline mode. Each pipeline stage has its own stage program counter. A global program counter is incremented in the pipeline mode. The instruction addresses generated in the global program counter are distributed to those pipeline stages which first become available to perform pipelined data processing. Any given pipeline stage may dynamically switch between pipeline mode and a parallel mode in which the stage program counter counts and supplies instruction addresses independently of any other pipeline stage. A vector mode uses pipeline instructions which are repeated to enable any number of the pipeline stages to participate in vector calculations. In the vector mode, one pipeline instruction address is held in the global program counter to be repeatedly supplied to respective first available pipeline stages until the vector calculations are completed. Other program means are disclosed which effect efficient control of program executions in all three modes and enable the concurrent execution in any mode by any of the pipeline stages.