Showing papers on "Pipeline (computing) published in 1983"


Journal ArticleDOI
TL;DR: The basic problem of reorganization of machine-level instructions at compile time is shown to be NP-complete; a heuristic algorithm is proposed, and its properties and effectiveness are explored.
Abstract: Pipeline interlocks are used in a pipelined architecture to prevent the execution of a machine instruction before its operands are available. An alternative to this complex piece of hardware is to rearrange the instructions at compile time to avoid pipeline interlocks. This problem is called code reorganization and is studied. The basic problem of reorganization of machine-level instructions at compile time is shown to be NP-complete. A heuristic algorithm is proposed, and its properties and effectiveness are explored. Empirical data from MIPS, a VLSI processor design, are given. The impact of code reorganization techniques on the rest of a compiler system is discussed. 30 references.

250 citations
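
To make the approach concrete, here is a minimal sketch of compile-time code reorganization: a greedy list scheduler that issues the earliest ready instruction in program order and falls back to a NOP when an interlock cannot be avoided. The instruction format, the one-cycle load delay, and the neglect of anti-dependences are illustrative assumptions, not details taken from the paper.

```python
# A hedged sketch of code reorganization: reorder instructions so that no
# instruction issues before its operands have cleared the pipeline delay.
# Anti-dependences (WAR/WAW) are ignored for brevity.

LOAD_DELAY = 1  # cycles a consumer must wait after its producer issues

def reorganize(instrs):
    """instrs: list of (text, dests, srcs) tuples in program order."""
    scheduled, issue_cycle = [], {}
    remaining = list(instrs)
    cycle = 0
    while remaining:
        picked = None
        for ins in remaining:  # earliest ready instruction, program order
            text, dests, srcs = ins
            if all(issue_cycle.get(r, -LOAD_DELAY - 1) + LOAD_DELAY < cycle
                   for r in srcs):
                picked = ins
                break
        if picked is None:
            scheduled.append(("NOP", (), ()))  # unavoidable interlock
        else:
            remaining.remove(picked)
            scheduled.append(picked)
            for r in picked[1]:
                issue_cycle[r] = cycle
        cycle += 1
    return scheduled

prog = [("load r1,A", ("r1",), ()),
        ("load r2,B", ("r2",), ()),
        ("add r3,r1,r2", ("r3",), ("r1", "r2")),
        ("load r4,C", ("r4",), ())]
for ins in reorganize(prog):
    print(ins[0])   # the load of C is moved into the add's delay slot
```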


Patent
Tangu Hao Shii, Huei Ling, Howard E. Sachar, Jeffrey Weiss, Yannis John Yamour
14 Mar 1983
TL;DR: In this paper, a secondary data flow facility with additional capability, to emulate the simultaneous processing of the prerequisite instruction and the dependent instruction, is proposed to improve simultaneous pipeline processing of inherently sequential instructions (k)-at-a-time, by eliminating delays for calculating prerequisite operands.
Abstract: Equipping a secondary data flow facility with additional capability, to emulate for certain operations the simultaneous processing of the prerequisite instruction and the dependent instruction, significantly improves simultaneous pipeline processing of inherently sequential instructions (k)-at-a-time, by eliminating delays for calculating prerequisite operands. For example, Instruction A+B=Z1 followed by Instruction Z1+C=Z2 is inherently sequential, with A+B=Z1 the prerequisite instruction and Z1+C=Z2 the dependent instruction. The specially equipped secondary data flow facility does not wait for Z1, the apparent input operand from the prerequisite instruction; it simulates Z1 instead, performing A+B+C=Z2 in parallel with A+B=Z1. All data flow facilities need not be fully equipped for all instructions; the secondary data flow facility may be generally less massive than a primary data flow facility, but is more sophisticated in a critical organ, such as the adder. The three-input adder of the secondary data flow facility emulates the result of a two-input adder of a primary data flow facility, occurring simultaneously in the two-input primary data flow facility adder, adding the third operand to the emulated result, without delay. The instruction unit decodes the instruction sequence normally to control (k)-at-a-time execution where there are no instruction interlocks or dependencies; to delay execution of dependent instructions until operands become available; and to reinstate (k)-at-a-time execution in a limited number of cases by using the additional capability of the secondary data flow facility to emulate the prerequisite operands. A control unit performs housekeeping to execute the instructions.

137 citations
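
A tiny sketch of the central trick may help: the secondary unit's three-input adder computes the dependent result directly from the primary instruction's source operands, so it never waits for Z1. The function names and single-cycle framing are illustrative, not the patent's circuitry.

```python
# Hedged model of collapsing a dependent add: while the primary two-input
# adder produces Z1 = A + B, the secondary three-input adder produces
# Z2 = A + B + C in the same cycle, emulating Z1 + C without waiting.

def primary_add(a, b):
    return a + b            # primary data flow facility: two-input adder

def secondary_add3(a, b, c):
    return a + b + c        # secondary facility: three-input adder

A, B, C = 5, 7, 11
Z1 = primary_add(A, B)            # prerequisite instruction
Z2 = secondary_add3(A, B, C)      # dependent instruction, issued in parallel
assert Z2 == Z1 + C               # matches the sequential result
print(Z1, Z2)                     # 12 23
```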


01 Nov 1983
TL;DR: The authors address two important issues in systolic array designs: fault-tolerance and two-level pipelining, and show that both problems can be reduced to the same mathematical problem of incorporating extra delays on certain data paths in originally correct systolic designs.
Abstract: The authors address two important issues in systolic array designs: fault-tolerance and two-level pipelining. The proposed systolic fault-tolerant scheme maintains the original data flow pattern by bypassing defective cells with a few registers. As a result, many of the desirable properties of systolic arrays (such as local and regular communication between cells) are preserved. Two-level pipelining refers to the use of pipelined functional units in the implementation of systolic cells. Their paper addresses the problem of efficiently utilizing pipelined units to increase the overall system throughput. They show that both of these problems can be reduced to the same mathematical problem of incorporating extra delays on certain data paths in originally correct systolic designs. They introduce the mathematical notion of a cut which enables them to handle this problem effectively. The results obtained by applying the techniques described are encouraging. When applied to systolic arrays without feedback cycles, the arrays can tolerate large numbers of failures (with the addition of very little hardware) while maintaining the original throughput. Furthermore, all of the pipeline stages in the cells can be kept fully utilized through the addition of a small number of delay registers. However, adding delays to systolic arrays with cycles typically induces a significant decrease in throughput. In response to this, they have derived a new class of systolic algorithms in which the data cycle around a ring of processing cells. The systolic ring architecture has the property that its performance degrades gracefully as cells fail. Using the cut theory for arrays without feedback and the ring architecture approach for those with feedback, they have effective fault-tolerant and two-level pipelining schemes for most systolic arrays. 24 references.

114 citations
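
A minimal functional model of the bypass scheme, under stated assumptions: the array has spare cells, the weights of a multiply-accumulate array are mapped onto the healthy cells, and each defective cell simply passes its accumulating value through a register. Timing is ignored; the cell function and spare-cell layout are illustrative, not taken from the paper.

```python
# Hedged sketch: a defective cell is bypassed by a register, and its work
# is remapped to a spare cell, so the array's results are unchanged.

def run_array(xs, weights, n_cells, defective=frozenset()):
    healthy = [i for i in range(n_cells) if i not in defective]
    assert len(healthy) >= len(weights), "not enough working cells"
    w_of = dict(zip(healthy, weights))  # weights mapped onto healthy cells
    results = []
    for x in xs:
        acc = 0
        for i in range(n_cells):
            if i in w_of:
                acc += w_of[i] * x      # working cell: multiply-accumulate
            # defective or unused cell: acc just crosses a bypass register
        results.append(acc)
    return results

print(run_array([1, 2, 3], [10, 20, 30], n_cells=4))                 # healthy
print(run_array([1, 2, 3], [10, 20, 30], n_cells=4, defective={1}))  # same
```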


Patent
27 May 1983
TL;DR: In this paper, the address generator, the pipeline control sequencer, and the master processing unit are configured in parallel, and a sign latch micro-instruction control is operative to provide the arithmetic and logical unit with a data dependent decision making capability.
Abstract: A full floating point vector processor includes a master processing unit having DMA I/O means, a wide bandwidth data memory having static RAM and/or interleaved dynamic RAM, an address generator operative to provide address generation for data loaded in the data memory, a concurrently operating pipeline control sequencer operative to provide fully programmable horizontal format microinstructions synchronously with the addresses generated by the address generator, and a pipelined arithmetic and logical unit responsive to the addressed data and to the synchronously provided microinstructions and operative to evaluate one of a user selectable plurality of computationally intensive functions. The address generator, the pipeline control sequencer, and the master processing unit are configured in parallel. The address generator includes means operative to provide pipeline input and output data dependent address generation. The microinstruction controlled pipelined arithmetic and logical unit includes two register files controllably interconnectable over feedforward and feedback data flow paths, a user selectable fixed or floating point format multiplier, a user selectable fixed or floating point format arithmetic and logical unit, and a sign latch coupled between the arithmetic and logical unit and one of the register files. The sign latch microinstruction control is operative to provide the arithmetic and logical unit with a data dependent decision making capability. A microinstruction controlled write address FIFO and a read address FIFO are coupled to the data memory.

85 citations


Proceedings ArticleDOI
13 Jun 1983
TL;DR: Although no direct comparisons are made with other computers, the low pipeline idle time in this machine indicates that this architectural technique may be more beneficial in an MIMD machine than in either SISD or SIMD machines.
Abstract: A pipelined implementation of MIMD operation is embodied in the HEP computer. This architectural concept should be carefully evaluated now that such a computer is available commercially. This paper studies the degree of utilization of pipelines in the MIMD environment. A detailed analysis of two extreme cases indicates that pipeline utilization is quite high. Although no direct comparisons are made with other computers, the low pipeline idle time in this machine indicates that this architectural technique may be more beneficial in an MIMD machine than in either SISD or SIMD machines.

61 citations


Journal ArticleDOI
TL;DR: A number of concepts related to computer studies of transient flow in pipeline systems are addressed in this paper, including organizational concepts for system data handling, and ideas to realize certain storage and computational efficiencies when using the method of characteristics as the computational procedure.
Abstract: A number of concepts related to computer studies of transient flow in pipeline systems are addressed. The topics are directed to applications on microcomputers, but are not limited to that special purpose. The topics include organizational concepts for system data handling, and ideas to realize certain storage and computational efficiencies when using the method of characteristics as the computational procedure. Alternatives to the method of specified time intervals, namely a staggered grid and an algebraic treatment, are discussed. A simple improved modification in the friction term is also emphasized. Two common elements in hydraulic systems that influence the form of pressure waves, series connections and lossy elements, are focused upon with a view to providing an improved visualization of their response characteristics. Equations and graphs are presented for cases with and without initial through flow. Finally, an example of the failure of a physical system is presented to emphasize the importance of unsteady flow visualization in hydraulic system design.

43 citations
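
Since the paper's computational procedure is the method of characteristics, a small worked sketch may be useful. It simulates water hammer after instantaneous valve closure at the downstream end of a single reservoir-fed pipe; the pipe data, grid size, and boundary conditions are illustrative assumptions, not the paper's examples.

```python
# Hedged sketch of the method of characteristics for transient pipe flow.
import math

a, g = 1000.0, 9.81           # wave speed (m/s), gravity (m/s^2)
L, D, f = 1000.0, 0.5, 0.02   # length (m), diameter (m), friction factor
A = math.pi * D**2 / 4        # flow area
N = 10                        # reaches
dx = L / N
dt = dx / a                   # Courant condition: dt = dx / a
B = a / (g * A)               # characteristic impedance
R = f * dx / (2 * g * D * A**2)

H0, Q0 = 100.0, 0.2           # reservoir head (m), initial flow (m^3/s)
H = [H0 - R * Q0 * abs(Q0) * i for i in range(N + 1)]  # steady state
Q = [Q0] * (N + 1)

for step in range(50):        # valve at x = L closes at t = 0
    Hn, Qn = H[:], Q[:]
    for i in range(1, N):     # interior points: intersect C+ and C-
        Cp = Hn[i-1] + B * Qn[i-1] - R * Qn[i-1] * abs(Qn[i-1])
        Cm = Hn[i+1] - B * Qn[i+1] + R * Qn[i+1] * abs(Qn[i+1])
        Q[i] = (Cp - Cm) / (2 * B)
        H[i] = (Cp + Cm) / 2
    # upstream reservoir: head fixed, flow from the C- characteristic
    Cm = Hn[1] - B * Qn[1] + R * Qn[1] * abs(Qn[1])
    H[0], Q[0] = H0, (H0 - Cm) / B
    # downstream closed valve: flow zero, head from the C+ characteristic
    Cp = Hn[N-1] + B * Qn[N-1] - R * Qn[N-1] * abs(Qn[N-1])
    Q[N], H[N] = 0.0, Cp

print(f"head at the valve after {50 * dt:.1f} s: {H[N]:.1f} m")
```

The staggered-grid and algebraic alternatives the paper discusses change how these characteristic equations are organized, not the equations themselves.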


PatentDOI
TL;DR: The PLA contains microcode which enables programs in external memory to be loaded from any location in memory and run by command from the console, and which enables the operator to halt user program execution, read the pertinent internal registers, and then continue program execution, so that single-step execution for debugging purposes is possible.

42 citations


Patent
28 Jun 1983
TL;DR: In this paper, an arithmetic system includes an arithmetic unit of a pipeline structure for executing arithmetic operations for instructions which require different arithmetic cycles, and the arithmetic unit executes arithmetic operations in pipeline for up to N instructions at a time.
Abstract: An arithmetic system includes an arithmetic unit of a pipeline structure for executing arithmetic operations for instructions which require different arithmetic cycles. The arithmetic unit executes arithmetic operations in pipeline for up to N instructions at a time. Initiation of an arithmetic operation for a new instruction in the arithmetic unit is indicated by an indicator which detects that each of the instructions executing in the arithmetic unit is N cycles before completion of its execution, and allows the arithmetic operation for the new instruction to be initiated in the succeeding cycle.

35 citations


Proceedings Article
01 Jan 1983
TL;DR: The authors show that, for certain programs in the VAL language, it is possible to construct machine-level data flow programs that support fully pipelined computation.
Abstract: Data flow computers are a radical departure from conventional computer architecture, and new methodologies are required for generating efficient machine-level programs from high level user programming languages. The authors show that, for certain programs in the VAL language, it is possible to construct machine-level data flow programs that support fully pipelined computation. A VAL program in the class considered consists of blocks of code each of which defines a new array value either by a forall expression in which each element may be computed independently, or by a for-iter expression that defines array elements by a first-order recurrence relation. 7 references.

35 citations
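
The two block forms the paper considers can be modeled directly, with illustrative function bodies: a forall block whose elements are mutually independent (and hence fully pipelinable on a data flow machine), and a for-iter block defined by a first-order recurrence.

```python
# Hedged sketch of the two VAL code-block forms, in Python for brevity.

def forall_block(n):
    # forall: each element depends only on its index, so all n element
    # computations can proceed through the pipeline independently
    return [i * i for i in range(n)]

def for_iter_block(n, a, b):
    # for-iter: first-order recurrence x[i] = a * x[i-1] + b[i]; elements
    # are ordered, but the recurrence still admits pipelined evaluation
    x = [0.0] * n
    for i in range(1, n):
        x[i] = a * x[i - 1] + b[i]
    return x

print(forall_block(5))                          # [0, 1, 4, 9, 16]
print(for_iter_block(5, 0.5, [1, 2, 3, 4, 5]))  # [0.0, 2.0, 4.0, 6.0, 8.0]
```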


Proceedings ArticleDOI
13 Jun 1983
TL;DR: The experimental system adopts a low-key technology and yet is capable of executing about 0.7 million instructions per second through the benchmarks, implying that data flow computers can be an alternative to conventional von Neumann computers if state-of-the-art technologies are adequately introduced.
Abstract: This paper describes an architecture of a data flow computer named the Distributed Data Driven Processor (DDDP), and presents an experimental system and the results of experiments using several benchmarks. The experimental system has four processing elements connected by a ring bus, and a structured data memory. The main features of our system are that each processing element is provided with a hardware hashing mechanism to implement token coloring, and a ring bus is used to pass tokens concurrently among processing elements. A hardware monitor was used to measure the performance of the experimental system. The experimental system adopts a low-key technology and yet is capable of executing about 0.7 million instructions per second through the benchmarks. This implies that data flow computers can be an alternative to conventional von Neumann computers if state-of-the-art technologies are adequately introduced.

33 citations
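
A small sketch of the token-matching mechanism each DDDP processing element implements in hardware: tokens carrying the same color and destination are paired in a hash-addressed matching store, so separate loop instances can share one instruction without confusion. The token fields and two-operand assumption are illustrative.

```python
# Hedged model of token coloring with a hashed matching store.
matching_store = {}   # keyed by (destination instruction, color)

def arrive(token):
    """token = (dest, color, port, value); fires dest once both operand
    tokens with the same color have arrived."""
    dest, color, port, value = token
    key = (dest, color)                        # the hardware hashes this key
    waiting = matching_store.pop(key, None)
    if waiting is None:
        matching_store[key] = (port, value)    # wait for the partner token
        return None
    ops = {port: value, waiting[0]: waiting[1]}
    return (dest, color, ops[0], ops[1])       # matched pair: node fires

print(arrive(("add", 1, 0, 10)))   # None: waits for its partner
print(arrive(("add", 2, 0, 30)))   # None: different color, also waits
print(arrive(("add", 1, 1, 5)))    # ('add', 1, 10, 5): instance 1 fires
print(arrive(("add", 2, 1, 7)))    # ('add', 2, 30, 7): instance 2 fires
```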


Patent
Nukiyama Tomoji
31 May 1983
TL;DR: In this article, a pipeline bus serially connects pipeline stages such that input data supplied through an input unit can be serially transported through the several pipeline stages and finally to an output unit.
Abstract: A pipeline processing apparatus has a plurality of pipeline stages, each stage including a pipeline latch and a pipeline processing circuit. A pipeline bus serially connects the several pipeline stages such that input data supplied through an input unit can be serially transported through the several pipeline stages and finally to an output unit. To facilitate testing the pipeline processing apparatus, and specifically the individual pipeline stages and the data passing through these individual stages independently of the pipeline processing cycle, there is provided a common bus coupled to the input unit, the output unit and selectively to each of the pipeline stages. A designated pipeline stage is selectively coupled to the common bus, causing test data to be supplied to the designated pipeline stage and subsequently read out from the designated stage.
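
A minimal sketch of the testing arrangement, with illustrative stage functions: in normal operation data flows serially through every stage, while the common bus lets test data be driven into any one designated stage and its result read back, independent of the pipeline cycle.

```python
# Hedged model of per-stage testing over a common bus.

class TestablePipeline:
    def __init__(self, stages):
        self.stages = stages              # per-stage processing circuits

    def run(self, x):
        """Normal serial operation over the pipeline bus."""
        for stage in self.stages:
            x = stage(x)
        return x

    def test_stage(self, i, test_data):
        """Common-bus test: inject data into stage i alone, read it out."""
        return self.stages[i](test_data)

p = TestablePipeline([lambda x: x + 1, lambda x: 2 * x, lambda x: x - 3])
print(p.run(5))             # ((5 + 1) * 2) - 3 = 9
print(p.test_stage(1, 7))   # stage 1 in isolation: 14
```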

01 Jan 1983
TL;DR: A new rearrangeability proof for interconnection networks is developed, with the same lower bound hardware requirement as the Benes network but for a general configuration; the proof technique can be applied to any lower bound rearrangeable interconnection network.
Abstract: In this thesis the rearrangeability of interconnection networks and the data movement between the global memory and the processor local memory are studied. A new rearrangeability proof for interconnection networks is developed, with the same lower bound hardware requirement as the Benes network but for a general configuration. This new proof technique is universal, in the sense that it can be applied to any lower bound rearrangeable interconnection network. It is also a constructive proof which yields a control algorithm. Another problem studied is the effect of global delays on system speed, caused by the traffic between local memory and global memory in parallel processor systems. The memory bandwidth, memory conflicts and interconnection conflicts contribute to global delays. A Prefetch/Execute/Poststore pipeline is introduced to reduce the performance degradation due to global delays for innermost vector loops. The analyzing vectorizer PARAFRASE is used to measure the speedup loss on 31 scientific programs with and without the pipeline.
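
The Prefetch/Execute/Poststore idea can be sketched as a software-pipelined vector loop: while block i executes out of local memory, block i+1 is being prefetched from global memory and block i-1's results are being stored back. The block size, the vector operation, and the sequential stand-in for overlapped stages are illustrative assumptions.

```python
# Hedged sketch of a Prefetch/Execute/Poststore pipeline for a vector loop.

def pep_pipeline(global_in, op, block=4):
    blocks = [global_in[i:i + block]
              for i in range(0, len(global_in), block)]
    fetched = iter(blocks)
    prefetched = next(fetched, None)  # prime the pipeline
    local_in = local_out = None       # local-memory buffers
    out = []
    while prefetched is not None or local_in is not None:
        # on real hardware the three stages below overlap in time
        if local_out is not None:
            out.extend(local_out)                       # Poststore block i-1
        local_out = [op(x) for x in local_in] if local_in else None  # Execute
        local_in = prefetched                           # Prefetch block i+1
        prefetched = next(fetched, None)
    if local_out is not None:
        out.extend(local_out)                           # drain the pipeline
    return out

print(pep_pipeline(list(range(10)), lambda x: 2 * x))
```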

Journal ArticleDOI
Yoshimune Hagiwara, Y. Kita, T. Miyamoto, Y. Toba, H. Hara, T. Akazawa
TL;DR: HSP architecture, LSI design, and a speech analysis application are described, which makes it possible to construct a compact speech analysis circuit by the LPC (PARCOR) method with two HSP's.
Abstract: A single chip high-performance digital signal processor (HSP) has been developed for speech, telecommunication, and other applications. The HSP uses 3 μm CMOS technology and its architecture features floating point arithmetic and pipeline structure. By adoption of floating point arithmetic, data covering a wide dynamic range (up to 32 bits) can be manipulated. The input clock frequency is 16 MHz, and the instruction cycle time is 250 ns. Efficient signal processing instructions and a large internal memory (program ROM: 512 words; data RAM: 200 words; data ROM: 128 words) make it possible to construct a compact speech analysis circuit by the LPC (PARCOR) method with two HSP's. This paper describes HSP architecture, LSI design, and a speech analysis application.

Proceedings ArticleDOI
13 Jun 1983
TL;DR: The design philosophy of the data flow processor array system presented in this paper is to achieve high performance by adapting a system structure to operational characteristics of application programs, and also to attain flexibility through executing instructions based on a data driven mechanism.
Abstract: This paper presents the architecture of a highly parallel processor array system which executes programs by means of a data driven control mechanism. The data driven control mechanism makes it easy to construct an MIMD (multiple instruction stream and multiple data stream) system, since it unifies inter-processor data transfer and intra-processor execution control. The design philosophy of the data flow processor array system presented in this paper is to achieve high performance by adapting a system structure to operational characteristics of application programs, and also to attain flexibility through executing instructions based on a data driven mechanism. The operational characteristics of the proposed system are analyzed using a probability model of the system behavior. Comparing the analytical results with the simulation results through an experimental hardware system, the results of the analysis clarify the principal effectiveness of the proposed system. This system can achieve high operation rates and is neither sensitive to inter-processor communication delay nor sensitive to system load imbalance.

Proceedings ArticleDOI
28 Nov 1983
TL;DR: The fault-tolerant scheme proposed maintains the original data flow patterns by simply by-passing defective cells with a small number of registers, preserving the desirable properties of systolic arrays, such as local and regular communication, massive parallelism and high data throughput.
Abstract: This paper addresses two important problems in systolic arrays: fault-tolerance and two-level pipelining. The fault-tolerant scheme we propose maintains the original data flow patterns by simply by-passing defective cells with a small number of registers. As a result, the desirable properties of systolic arrays, such as local and regular communication, massive parallelism and high data throughput, are all preserved. Two-level pipelining refers to the use of pipelined functional units in the implementation of systolic cells. This paper also addresses the problem of efficiently utilizing such units to increase overall system throughput. We show that both of these problems can be reduced to the same mathematical problem of incorporating extra delays on certain data paths in originally correct systolic designs. We introduce the mathematical notion of a cut which enables us to handle this problem systematically. The results obtained by applying the techniques described in this paper are encouraging. When applied to systolic arrays without feedback cycles, the arrays can tolerate large numbers of faults with the addition of very little hardware, while maintaining the original throughput. Furthermore, all of the pipeline stages in the cells can be kept fully utilized through the addition of a small number of delay registers. However, adding delays to systolic arrays with cycles typically induces a significant decrease in throughput. In response to this, we have derived a new class of systolic algorithms in which the data cycle around a ring of processing cells. The systolic ring architecture has the property of degrading gracefully as cells fail. It can be used in place of many systolic arrays with feedback cycles. Using our cut theory for arrays without feedback and the ring architecture approach for arrays with feedback, we have an effective fault-tolerant scheme for every systolic array that we have considered. Furthermore, as by-products of the ring architecture approach we have derived new systolic algorithms. These algorithms generally require only one-third to one-half of the number of cells used in previous designs to achieve the same throughput. Included in these new systolic algorithms are ones for LU-decomposition, QR-decomposition and the solution of triangular linear systems.
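
The cut notion at the heart of both versions of this work can be shown with a two-path example, which is an illustrative network rather than one of the paper's arrays: adding the same delay to every edge crossing a cut of a feedback-free design leaves the computed values intact (merely shifted in time), whereas delaying only some crossing edges misaligns operands.

```python
# Hedged sketch of the cut condition on a two-path synchronous network.

def run(xs, delay_path1, delay_path2):
    """Two paths feed a combining cell; each path is a register chain."""
    line1 = [0] * delay_path1       # registers on path 1
    line2 = [0] * delay_path2       # registers on path 2
    out = []
    for x in xs:
        a, b = 2 * x, 3 * x         # cells on the two paths across the cut
        line1.append(a); a = line1.pop(0)
        line2.append(b); b = line2.pop(0)
        out.append(a + b)           # combining cell needs aligned operands
    return out

xs = [1, 2, 3, 4, 5, 6]
print(run(xs, 0, 0))  # no extra delay: [5, 10, 15, 20, 25, 30]
print(run(xs, 1, 1))  # both crossing edges delayed: same values, shifted
print(run(xs, 1, 0))  # only one edge delayed: operands misaligned
```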

Journal ArticleDOI
TL;DR: A family of VLSI circuits is presented to perform open convolution, i.e., polynomial multiplication, and, depending on the degree of parallelism or pipelining, they range from a compact but slow convolver to a large but very fast convolver.
Abstract: A family of VLSI circuits is presented to perform open convolution, i.e., polynomial multiplication. The circuits are all based on a recursive construction and are therefore particularly well adapted to automated design. All the circuits presented are optimal with respect to the area–time² tradeoff, and, depending on the degree of parallelism or pipelining, they range from a compact but slow convolver to a large but very fast convolver.
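
The recursive construction can be illustrated with a divide-and-conquer polynomial multiplier in software; whether the circuits use this particular (Karatsuba-style) recursion is an assumption, but it shows how a size-n convolver is assembled from half-size convolvers, which is what makes the family amenable to automated, recursive layout.

```python
# Hedged sketch: recursive polynomial multiplication (open convolution).

def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists, lowest degree
    first, by splitting into halves and recursing (Karatsuba)."""
    n = 1
    while n < max(len(p), len(q)):
        n *= 2                              # pad sizes to a power of two
    if n == 1:
        return [p[0] * q[0]]
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    h = n // 2
    p0, p1, q0, q1 = p[:h], p[h:], q[:h], q[h:]
    low = poly_mul(p0, q0)                  # three half-size products
    high = poly_mul(p1, q1)
    mid = poly_mul([a + b for a, b in zip(p0, p1)],
                   [a + b for a, b in zip(q0, q1)])
    mid = [m - l - hh for m, l, hh in zip(mid, low, high)]
    out = [0] * (2 * n - 1)                 # recombine the three products
    for i, c in enumerate(low):
        out[i] += c
    for i, c in enumerate(mid):
        out[i + h] += c
    for i, c in enumerate(high):
        out[i + 2 * h] += c
    return out

# (1 + 2x)(3 + 4x + 5x^2) = 3 + 10x + 13x^2 + 10x^3 (trailing zeros
# come from the power-of-two padding)
print(poly_mul([1, 2], [3, 4, 5]))  # [3, 10, 13, 10, 0, 0, 0]
```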

Patent
28 Feb 1983
TL;DR: In this paper, a method of perforating a main pipeline while in service, and split joints suited to this method, in order to connect a branch pipeline to the main pipeline, is described.
Abstract: The disclosed invention provides a novel method of perforating a main pipeline while in service, and split joints suited to this method, in order to connect a branch pipeline thereto. According to this method, perforating work is carried out from an opposite side across the main pipeline to where the branch pipeline is connected, as distinct from known methods in which perforating work is carried out on the side of the main pipeline to which a branch pipeline is connected.

Patent
29 Sep 1983
TL;DR: In this article, an entry control store in a central processing unit (CPU) is addressed by the next macroinstruction to be executed by the CPU and fetches the microcode for the first line of that macro-instruction.
Abstract: An entry control store in a central processing unit (CPU) is addressed by the next macroinstruction to be executed by the CPU and fetches the microcode for the first line of that macroinstruction. Subsequent lines of microcode for that macroinstruction are fetched from a main control store.
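
A toy model of the two-store arrangement, with hypothetical opcodes and microinstruction strings: the entry control store, addressed directly by the next macroinstruction, supplies the first line of microcode (and the continuation address), and the main control store supplies the rest.

```python
# Hedged sketch of the entry-control-store / main-control-store split.

entry_control_store = {"ADD": ("add-step-0", 0x10),   # first line plus the
                       "MUL": ("mul-step-0", 0x20)}   # continuation address
main_control_store = {0x10: ["add-step-1", "add-step-2"],
                      0x20: ["mul-step-1"]}

def microcode_for(opcode):
    first, cont = entry_control_store[opcode]  # fast first-line fetch
    return [first] + main_control_store[cont]  # remainder from main store

print(microcode_for("ADD"))  # ['add-step-0', 'add-step-1', 'add-step-2']
```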

Proceedings ArticleDOI
20 Jun 1983
TL;DR: Two new vector reduction methods, symmetric and asymmetric, are proposed and analyzed for pipelined processing and compare favorably with the known recursive reduction method in achieving higher pipeline utilization and in eliminating large memory for intermediate results.
Abstract: Vector reduction arithmetic accepts a vector as input and produces a scalar output. This class of vector operations forms the basis of many scientific computations. In a pipelined processor, a feedback loop is required to reduce vectors. Since the output of the pipeline depends on previous outputs, improper control of the feedback loop will destroy the benefit from pipelining. A generalized computing model is proposed to schedule the activities in a vector reduction pipeline. Two new vector reduction methods, symmetric and asymmetric, are proposed and analyzed for pipelined processing. These two methods compare favorably with the known recursive reduction method in achieving higher pipeline utilization and in eliminating large memory for intermediate results. An interleaving method is proposed to reduce multiple vectors to multiple scalars in a single arithmetic pipeline. The pipeline can be fully utilized by interleaved multiple vector processing.
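
The feedback problem and the benefit of keeping several partial results in flight can be sketched with a simple cycle-counting model of a K-stage pipelined adder. This illustrates the general principle behind the paper's methods rather than its specific symmetric and asymmetric schedules; the stage count and cost model are assumptions.

```python
# Hedged sketch: reducing a vector through a K-stage pipelined adder.
K = 4  # adder pipeline depth

def naive_reduce(v):
    """One running sum: every add waits K cycles for the previous result."""
    s, cycles = 0, 0
    for x in v:
        s += x
        cycles += K
    return s, cycles

def interleaved_reduce(v):
    """K independent partial sums keep one add starting every cycle."""
    partial = [0] * K
    cycles = 0
    for i, x in enumerate(v):
        partial[i % K] += x       # consecutive adds hit different sums
        cycles += 1
    s = 0
    for p in partial:             # final K-way combine
        s += p
        cycles += K
    return s, cycles

v = list(range(100))
print(naive_reduce(v))        # (4950, 400)
print(interleaved_reduce(v))  # (4950, 116)
```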

Patent
13 Oct 1983
TL;DR: In this article, a collector for the results of a pipelined central processing unit of a digital data processing system is presented, where results of the execution of each instruction are stored in a result stack 38, 40, 42, 44 associated with each execution unit.
Abstract: A collector for the results of a pipelined central processing unit of a digital data processing system. The processor has a plurality of execution units 24, 26, 28, 30, with each execution unit executing a different set of instructions of the instruction repertoire of the processor. The execution units execute instructions issued to them in order of issuance by the pipeline 12 and in parallel. As instructions are issued to the execution units, the operation code identifying each instruction is also issued in program order to an instruction execution queue 18 of the collector. The results of the execution of each instruction by an execution unit are stored in a result stack 38, 40, 42, 44 associated with each execution unit. Collector control 46 causes the results of the execution of instructions to program visible registers to be stored in a master safe store register 48 in program order which is determined by the order of instructions stored in the instruction execution stack on a first-in, first-out basis. The collector also issues write commands to write results of the execution of instructions into memory in program order via a store stack 50.
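
The collector's ordering discipline can be modeled with queues, since each execution unit completes its own instructions in issue order: the instruction execution queue records program order at issue time, and commit simply drains it, popping each unit's result stack first-in, first-out. Unit names and results are illustrative.

```python
# Hedged sketch of in-order result collection from parallel execution units.
from collections import deque

instruction_queue = deque()             # program-order record of issue
result_stacks = {"alu": deque(), "mul": deque()}

def issue(unit, result):
    instruction_queue.append(unit)      # recorded in program order
    result_stacks[unit].append(result)  # each unit is FIFO within itself

def collect():
    """Commit results to architectural state in program order."""
    committed = []
    while instruction_queue:
        unit = instruction_queue.popleft()
        committed.append(result_stacks[unit].popleft())
    return committed

issue("alu", "I0"); issue("mul", "I1"); issue("alu", "I2")
print(collect())  # ['I0', 'I1', 'I2'] regardless of completion order
```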


Proceedings ArticleDOI
30 Nov 1983
TL;DR: Recent developments in the design of integrated optical circuits (IOCs) for performing optical numerical computations are discussed, along with the natural marriage of IOCs with the systolic concept.
Abstract: The development of integrated optical circuits (IOCs) for numerical-computation applications is reviewed, with a focus on the use of systolic architectures. The basic architecture criteria for optical processors are shown to be the same as those proposed by Kung (1982) for VLSI design, and the advantages of IOCs over bulk techniques are indicated. The operation and fabrication of electrooptic grating structures are outlined, and the application of IOCs of this type to an existing 32-bit, 32-Mbit/sec digital correlator, a proposed matrix multiplier, and a proposed pipeline processor for polynomial evaluation is discussed. The problems arising from the inherent nonlinearity of electrooptic gratings are considered. Diagrams and drawings of the application concepts are provided.

PatentDOI
TL;DR: In this article, a method of detecting inadequately supported sections or overloaded points in a pipeline including the steps of traversing the interior of the pipeline with an instrumentation pig, sequentially striking or vibrating the wall of a pipeline by means carried by the pig to introduce vibratory signals into the pipeline, receiving said signals from within the pipeline by listening to the sounds generated as a consequence of the striking of the interior wall and detecting preselected characteristics of received sound which are indicative of unsupported sections or of points of load and stress concentration in the pipeline.
Abstract: A method of detecting inadequately supported sections or overloaded points in a pipeline including the steps of traversing the interior of the pipeline with an instrumentation pig, sequentially striking or vibrating the wall of the pipeline by means carried by the pig to introduce vibratory signals into the pipeline, receiving said signals from within the pipeline by listening to the sounds generated as a consequence of the striking of the interior wall, and detecting preselected characteristics of received sound which are indicative of unsupported sections or of points of load and stress concentration in the pipeline.

Proceedings ArticleDOI
01 Jan 1983
TL;DR: A realtime system using parallel processing and pipeline techniques to implement the pattern alignment operation is reported; the chip has been fabricated in 5 μm NMOS technology.
Abstract: A realtime system using parallel processing and pipeline techniques to implement the pattern alignment operation is reported. A 3.5 mm × 4.2 mm chip has been fabricated in 5 μm NMOS technology.

Patent
21 Jan 1983
TL;DR: In this paper, a microprogram-controlled type arithmetic control apparatus which performs pipeline processing is provided, in which a microinstruction corresponding to a macro-instruction to be processed is read out of a control storage at least one stage prior to an OF (operand fetch) stage.
Abstract: A microprogram-controlled type arithmetic control apparatus which performs pipeline processing is provided. In the apparatus, a microinstruction corresponding to a macroinstruction to be processed is read out of a control storage at least one stage prior to an OF (operand fetch) stage. The microinstruction and data (source data) are respectively output onto a microinstruction bus and a data bus at the OF stage. By the beginning of an E (instruction execution) stage, the data (source data) on the data bus and the microinstruction on the microinstruction bus are loaded into first and second registers, respectively. Then, at the beginning of the E stage, an arithmetic operation by an arithmetic and logic unit can be performed immediately.

Proceedings ArticleDOI
07 Nov 1983
TL;DR: New paradigms for the construction of efficient parallel graph algorithms, called filtration and funnelled pipelining, are introduced and illustrated with VLSI circuits for computing connected components, minimum spanning forests, and biconnected components.
Abstract: We introduce new paradigms for the construction of efficient parallel graph algorithms. These paradigms, called filtration and funnelled pipelining, are illustrated with VLSI circuits for computing connected components, minimum spanning forests, and biconnected components. These circuits use realistic I/O schedules and require time and area of O(n^(1+ε)). Thus they are essentially optimal. Filtration is a technique used to rapidly discard irrelevant input data. This greatly reduces storage, time, and communications costs in a wide variety of problems. A funnelled pipeline is obtained by building a series of increasingly thorough filter stages. Transition times along such a pipeline of filters form an exponentially increasing sequence. The increasing amount of time exactly balances the increasing degree of filtration. This balance makes possible the cascaded filtration critical to the minimum spanning forest and the biconnected components algorithms.
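
The balance the abstract describes can be demonstrated with a generic funnelled pipeline: each stage discards about half of its input, so a stage that is twice as thorough (twice the per-item cost) still does the same total work as its predecessor. The predicates and halving rate are illustrative assumptions, not the paper's graph circuits.

```python
# Hedged sketch of filtration with exponentially increasing stage costs.

def funnelled_pipeline(items, filters):
    """filters: predicates, cheapest and crudest first."""
    work_per_stage = []
    for depth, keep in enumerate(filters):
        cost_per_item = 2 ** depth          # later stages are more thorough
        work_per_stage.append(len(items) * cost_per_item)
        items = [x for x in items if keep(x)]   # filtration halves the data
    return items, work_per_stage

data = list(range(1024))
stages = [lambda x: x % 2 == 0,     # quick, crude filter
          lambda x: x % 4 == 0,     # slower, more selective
          lambda x: x % 8 == 0]     # most thorough
survivors, work = funnelled_pipeline(data, stages)
print(len(survivors), work)         # 128 [1024, 1024, 1024]: balanced stages
```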

01 May 1983
TL;DR: A technique is described that signals a storage access-exception condition for a data word after an execution unit pipeline has completed processing all preceding elements.
Abstract: A technique is described that signals a storage access-exception condition for a data word after an execution unit pipeline has completed processing all preceding elements.

Journal ArticleDOI
TL;DR: In this paper, a model for the estimation of liquid-hydrogen pressure and temperature along a pipeline equipped with cryogenic insulation is presented together with a numerical simulation scheme and the summary of a sensitivity analysis.

Patent
09 Sep 1983
TL;DR: In this paper, a pipelined parallel vector processor is described, where the vector registers are subdivided into a plurality of smaller registers, and an element processor, functioning in a pipeline mode, is associated with each smaller register for processing the M elements of the vectors stored in the smaller register and generating results of the processing.
Abstract: A pipelined parallel vector processor is disclosed. In order to increase the performance of the parallel vector processor, the present invention decreases the time required to process a pair of vectors stored in a pair of vector registers. The vector registers are subdivided into a plurality of smaller registers. A vector, stored in a vector register, comprises N elements; however, each of the smaller registers stores M elements of the vector, where M is less than N. An element processor, functioning in a pipeline mode, is associated with each smaller register for processing the M elements of the vector stored in the smaller register and generating results of the processing, the results being stored in one of the vector registers. The smaller registers of the vector registers, and their corresponding element processors, are structurally configured in a parallel fashion. The element processors and their associated smaller registers operate simultaneously. Consequently, processing of the N-element vectors stored in the vector registers is completed in the time required to process the M elements of one smaller register.
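
A functional sketch of the partitioning, using threads only as a stand-in for the simultaneous element processors: an N-element vector is split across N/M smaller registers, each served by its own pipelined element processor, so the whole operation takes roughly the time of an M-element one. The operation and sizes are illustrative assumptions.

```python
# Hedged sketch of a partitioned vector register with parallel
# element processors.
from concurrent.futures import ThreadPoolExecutor

N, M = 16, 4   # vector length, elements per smaller register

def element_processor(chunk_a, chunk_b):
    """One element processor handles its M-element slice in pipeline."""
    return [a + b for a, b in zip(chunk_a, chunk_b)]

def vector_add(va, vb):
    chunks = [(va[i:i + M], vb[i:i + M]) for i in range(0, N, M)]
    with ThreadPoolExecutor(max_workers=N // M) as pool:
        results = pool.map(lambda ab: element_processor(*ab), chunks)
    out = []
    for r in results:          # results land back in a vector register
        out.extend(r)
    return out

print(vector_add(list(range(N)), [10] * N))
```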