
Showing papers on "Parallel processing (DSP implementation)" published in 1979


Book
20 Mar 1979
TL;DR: A simulator for the parallel network system has been implemented in MACLISP, and an experimental version of NETL, a language for storing real-world information in such a network, is running on this simulator.
Abstract: This report describes a knowledge-base system in which the information is stored in a network of small parallel processing elements--node and link units--which are controlled by an external serial computer. Discussed is NETL, a language for storing real-world information in such a network. A simulator for the parallel network system has been implemented in MACLISP, and an experimental version of NETL is running on this simulator. A number of test-case results and simulated timings will be presented. (Author)

580 citations


Patent
21 May 1979
TL;DR: The massively parallel processor architecture enables very high speed processing of large amounts of ordered, parallel data, including spatial translation by shifting or "sliding" of bits vertically or horizontally to neighboring processing elements.
Abstract: An apparatus for processing multidimensional data with strong spatial characteristics, such as raw image data, characterized by a large number of parallel data streams in an ordered array, comprises a large number (e.g. 16,384 in a 128×128 array) of parallel processing elements operating simultaneously and independently on single bit slices of a corresponding array of incoming data streams under control of a single set of instructions. Each of the processing elements comprises a bidirectional data bus in communication with a register for storing single bit slices together with a random access memory unit and associated circuitry, including a binary counter/shift register device, for performing logical and arithmetical computations on the bit slices, and an I/O unit for interfacing the bidirectional data bus with the data stream source. The massively parallel processor architecture enables very high speed processing of large amounts of ordered, parallel data, including spatial translation by shifting or "sliding" of bits vertically or horizontally to neighboring processing elements.

320 citations
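
The neighbor-to-neighbor "sliding" of bits described in the patent abstract above can be illustrated in software. The following is a minimal Python sketch, not the patented circuit: the function name `shift_plane`, the nested-list representation of a bit plane, and the choice to fill vacated cells with 0 are all assumptions made for illustration.

```python
def shift_plane(plane, dr, dc):
    """Shift a 2-D bit plane by dr rows and dc columns.

    Assumption: cells that have no source neighbor are filled with 0
    (the abstract does not say what happens at the array edges).
    """
    rows, cols = len(plane), len(plane[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            src_r, src_c = r - dr, c - dc
            if 0 <= src_r < rows and 0 <= src_c < cols:
                # Each cell takes the bit held by its neighbor.
                out[r][c] = plane[src_r][src_c]
    return out


plane = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
]
right = shift_plane(plane, 0, 1)  # slide every bit one column to the right
down = shift_plane(plane, 1, 0)   # slide every bit one row down
```

In the hardware described, every processing element would perform this move at once under a single instruction; the loops here only simulate that step.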


Journal ArticleDOI
Siegel
TL;DR: This analysis compares a number of single- and multistage networks that have been proposed for SIMD interconnection, to put the different approaches into perspective.
Abstract: Many SIMD interconnection networks have been proposed. To put the different approaches into perspective, this analysis compares a number of single- and multistage networks.

160 citations


Patent
26 Nov 1979
TL;DR: A parallel array architecture feeds programs and data from a data base memory through memory modules and an Omega-type connection network to an array of processors, which run their own copies of each program independently and synchronize only when an explicit synchronization instruction is issued.
Abstract: A high speed parallel array data processing architecture fashioned under a computational envelope approach includes a data base memory for secondary storage of programs and data, and a plurality of memory modules interconnected to a plurality of processing modules by a connection network of the Omega gender. Programs and data are fed from the data base memory to the plurality of memory modules, and from there the programs are fed through the connection network to the array of processors (one copy of each program for each processor). Execution of the programs occurs with the processors normally operating quite independently of each other in a multiprocessing fashion. For data dependent operations and other suitable operations, all processors are instructed to finish one given task or program branch before all are instructed to proceed in parallel processing fashion on the next instruction. Even when functioning in the parallel processing mode, however, the processors do not run in lock step but execute their own copy of the program individually unless or until another overall processor array synchronization instruction is issued.

123 citations
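
An Omega network is commonly built from log2 N stages, each consisting of a perfect shuffle followed by a column of 2x2 exchange switches. The sketch below shows destination-tag routing through such a network, assuming that textbook construction. The abstract above does not say how the patented network is routed, and the name `omega_route` is hypothetical.

```python
def omega_route(src, dst, n_bits):
    """Return the line positions visited when routing src to dst.

    Assumes a textbook Omega network with 2**n_bits lines and n_bits
    stages (perfect shuffle, then a 2x2 exchange switch per line pair).
    """
    mask = (1 << n_bits) - 1
    pos = src
    path = [pos]
    for stage in range(n_bits):
        # Perfect shuffle: rotate the n-bit line address left by one.
        pos = ((pos << 1) | (pos >> (n_bits - 1))) & mask
        # Exchange switch: the destination bit (most significant first)
        # selects the upper (0) or lower (1) switch output.
        bit = (dst >> (n_bits - 1 - stage)) & 1
        pos = (pos & ~1) | bit
        path.append(pos)
    return path


path = omega_route(5, 2, 3)  # 8-line network, route line 5 to line 2
```

After `n_bits` stages every source bit has been rotated out and replaced by a destination bit, which is why the message arrives at `dst` regardless of where it started.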


Journal ArticleDOI
TL;DR: Overall conclusions indicate that the parallel algorithm is always much faster and sometimes has better convergence characteristics than the classical trapezoidal integration algorithm.
Abstract: The numerical method presented in this paper permits the solution of differential equations by trapezoidal integration in a time of order log2 T, where T is the number of discrete time steps required for the solution. The number of required parallel processors is T/2. Linear and nonlinear examples are presented. The nonlinear example corresponds to a small stability problem. The classical trapezoidal integration algorithm is compared to the new parallel trapezoidal algorithm in terms of solution time requirements. Also, for the nonlinear example the comparison includes the number of iterations and convergence characteristics. Overall conclusions indicate that the parallel algorithm is always much faster and sometimes has better convergence characteristics. Potential limitations of the method are also discussed.

120 citations
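
To see how a time-stepping recurrence can be evaluated in about log2 T rounds, consider the linear test equation x' = a x + b, an assumption chosen here for illustration. The trapezoidal rule then gives x_{k+1} = alpha x_k + beta, and composing these affine maps by recursive doubling yields every x_k after ceil(log2 T) rounds of independent work. This sketch is not the paper's algorithm (which uses T/2 processors and also treats a nonlinear example); the function names are hypothetical, and the parallel rounds are simulated sequentially.

```python
def trapezoid_coeffs(a, b, h):
    """Coefficients of the trapezoidal update x_{k+1} = alpha*x_k + beta."""
    denom = 1 - h * a / 2
    return (1 + h * a / 2) / denom, h * b / denom


def trapezoid_sequential(a, b, x0, h, steps):
    """Classical step-by-step trapezoidal integration of x' = a*x + b."""
    alpha, beta = trapezoid_coeffs(a, b, h)
    xs = [x0]
    for _ in range(steps):
        xs.append(alpha * xs[-1] + beta)
    return xs


def trapezoid_doubling(a, b, x0, h, steps):
    """Same result via recursive doubling over affine maps (A, B)."""
    alpha, beta = trapezoid_coeffs(a, b, h)
    maps = [(alpha, beta)] * steps  # maps[k] advances x_k to x_{k+1}
    stride, rounds = 1, 0
    while stride < steps:
        new = list(maps)
        # Every k in this loop is independent, so one round could run
        # on separate processors at the same time.
        for k in range(stride, steps):
            a1, b1 = maps[k - stride]  # earlier segment, applied first
            a2, b2 = maps[k]           # later segment
            new[k] = (a2 * a1, a2 * b1 + b2)
        maps = new
        stride *= 2
        rounds += 1
    # maps[k] now advances x_0 directly to x_{k+1}.
    return [x0] + [A * x0 + B for A, B in maps], rounds


seq = trapezoid_sequential(-1.0, 1.0, 0.0, 0.1, 8)
par, rounds = trapezoid_doubling(-1.0, 1.0, 0.0, 0.1, 8)
```

For 8 time steps the doubling version needs 3 rounds, against 8 dependent steps in the sequential loop, at the cost of more total arithmetic.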


Proceedings ArticleDOI
01 Dec 1979
TL;DR: This survey covers interconnection networks for reconfigurable, geographically-localized parallel processing systems using 12 or more processors, including multiple-SIMD and MIMD systems and both fixed and dynamic word size systems.
Abstract: This is a survey of a variety of interconnection networks for reconfigurable parallel processing systems that have appeared in the literature. A system is reconfigurable if it may assume several architectural configurations, each of which is characterized by its own topology of activated interconnections between modules [18]. The systems whose networks will be examined include multiple-SIMD and MIMD systems, as well as both fixed and dynamic word size systems. This paper is restricted to networks for geographically-localized parallel processing systems using 12 or more processors in a reconfigurable manner. Related survey papers include References 1, 3, 10, 19, 20, and 45-47.

76 citations


Journal ArticleDOI
TL;DR: It is shown analytically that the secondary‐image level is reduced and remains unchanged when the receiver angular aperture (aperture relative to distance) is limited and kept constant during the whole observation time.
Abstract: Ultrafast cardiac‐valve ultrasonic tomography requires parallel multichannel processing of received echoes. In parallel processing the level of secondary "ghost" images due to spatial undersampling is much higher than in slower series processors which use a selective field insonification. The paper describes a 20‐channel moving‐focus parallel‐processing analog electronic system, which is realized in our laboratory. It is shown analytically that the secondary‐image level is reduced and remains unchanged when the receiver angular aperture (aperture relative to distance) is limited and kept constant during the whole observation time.

76 citations


Journal ArticleDOI
TL;DR: The RAM model of Cook and Reckhow is extended to allow parallel recursive calls, and the elementary theory of such machines is developed under the uniform cost criterion. The results include proofs of (1) the equivalence of nondeterministic and deterministic polynomial time for such parallel machines and (2) the equivalence of polynomial time on such parallel machines and polynomial space on ordinary nonparallel RAMs.
Abstract: The RAM model of Cook and Reckhow is extended to allow parallel recursive calls, and the elementary theory of such machines is developed. The uniform cost criterion is used. The results include proofs of (1) the equivalence of nondeterministic and deterministic polynomial time for such parallel machines and (2) the equivalence of polynomial time on such parallel machines and polynomial space on ordinary nonparallel RAMs. Also included are results showing that parallelism appears to be strictly more powerful than nondeterminism ...

63 citations


Book ChapterDOI
06 Nov 1979
TL;DR: The process of creative visualization is of pictures as a whole, and conventional computer image processing could be broadly categorized as manipulation of pixel states rather than pictorial content.
Abstract: Conventional computers do not readily lend themselves to picture processing. Digital image manipulation by conventional computer is accomplished only at a tremendous cost in time and conceptual distraction. Computer image processing is the activity of modifying a picture such that retrieval of relevant pictorially encoded information becomes trivial. Algorithm development for image processing is an alternating sequence of inspired creative visualizations of desired processed results and the formal procedures implementing the desired process on a particular image processing system. But our process of creative visualization is of pictures as a whole. Implementation of the visualized image manipulation by conventional computer requires fragmentation of the pictorial concept into information units matched to the word oriented capabilities of general purpose machines. Conventional computer image processing could be broadly categorized as manipulation of pixel states rather than pictorial content.

57 citations


Patent
31 Dec 1979
TL;DR: A television signal noise reduction system employs a movement detector for automatically selecting one of three parallel processing paths, the first passing the video signal without modification and the other two calculating different weighted averages of the current pixel and surrounding pixels from the current frame and the previous frame.
Abstract: A television signal noise reduction system employs a movement detector for automatically selecting one of three parallel processing paths, the first of which passes the video signal without modification and the other two of which calculate different weighted averages of the current pixel and surrounding pixels from the current frame and the previous frame.

53 citations
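
The path selection described in the patent abstract above can be sketched in a few lines. Everything numeric here is an assumption made for illustration: the motion measure (absolute frame difference), the thresholds `low` and `high`, the 0.75/0.25 weights, the three-pixel neighborhood, and the function name `denoise_line` do not come from the patent.

```python
def denoise_line(cur, prev, low=4, high=16):
    """Filter one scan line using the previous frame's matching line.

    Hypothetical parameters: `low` and `high` are motion thresholds and
    the averaging weights below are arbitrary illustrative choices.
    """
    n = len(cur)
    out = []
    for i in range(n):
        lo_i, hi_i = max(0, i - 1), min(n - 1, i + 1)
        # Current pixel and its neighbors, from both frames.
        neigh = cur[lo_i:hi_i + 1] + prev[lo_i:hi_i + 1]
        mean = sum(neigh) / len(neigh)
        motion = abs(cur[i] - prev[i])  # stand-in movement detector
        if motion >= high:
            out.append(float(cur[i]))                # path 1: unmodified
        elif motion >= low:
            out.append(0.75 * cur[i] + 0.25 * mean)  # path 2: light average
        else:
            out.append(0.25 * cur[i] + 0.75 * mean)  # path 3: heavy average
    return out


filtered = denoise_line([10, 10, 100], [10, 10, 10])
```

The design idea is that averaging suppresses noise in static regions, while moving regions bypass the averaging so that they are not smeared across frames.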


Journal ArticleDOI
TL;DR: This work considers the problem of triangulating a sparse matrix in a parallel processing system and attempts to answer the following questions: how should the rows and columns of the matrix be reordered in order to minimize the completion time of the parallel triangulation process if an unrestricted number of processors are used.
Abstract: We consider the problem of triangulating a sparse matrix in a parallel processing system and attempt to answer the following questions: 1) How should the rows and columns of the matrix be reordered in order to minimize the completion time of the parallel triangulation process if an unrestricted number of processors are used? 2) If the number of processors is fixed, what is the minimum completion time and how should the parallel operations be scheduled? Implementation of the parallel algorithm is discussed and experimental results are given.

Journal ArticleDOI
Su
TL;DR: Cellular-logic devices, using a parallel processing element for each element of a rotating memory, allow fast data search and manipulation.
Abstract: Cellular-logic devices, using a parallel processing element for each element of a rotating memory, allow fast data search and manipulation.

Journal ArticleDOI
Berra, Oliver
TL;DR: With content addressing and parallel processing capabilities, associative array processors are potentially useful for data base management.
Abstract: With content addressing and parallel processing capabilities, associative array processors are potentially useful for data base management.

Journal ArticleDOI
Smith
TL;DR: Relational data base machines using head-per-track disk technology or its electronic equivalent can move processing logic closer to the data, providing simplified storage organizations for large-scale applications.
Abstract: Relational data base machines using head-per-track disk technology or its electronic equivalent can move processing logic closer to the data, providing simplified storage organizations for large-scale applications.

01 Dec 1979
TL;DR: It will be demonstrated that many kinds of heuristic search that are commonly implemented using backtracking can be reformulated to use parallel processing with advantage in control over the problem solving behavior.
Abstract: Parallel processing as a conceptual aid in the design of programs for problem solving applications is developed. A pattern-directed invocation language known as Ether is introduced. Ether embodies two notions in language design: activities and viewpoints. Activities are the basic parallel processing primitive. Different goals of the system can be pursued in parallel by placing them in separate activities. Language primitives are provided for manipulating running activities. Viewpoints are a generalization of context mechanisms and serve as a device for representing multiple world models. A number of problem solving schemes are developed making use of viewpoints and activities. It will be demonstrated that many kinds of heuristic search that are commonly implemented using backtracking can be reformulated to use parallel processing with advantage in control over the problem solving behavior. The semantics of Ether are such that problems such as deadlock and race conditions, which plague many languages for parallel processing, cannot occur. The programs presented are quite simple to understand. (Author)

Journal ArticleDOI
TL;DR: The authors give explicit algorithms in square-root form that allow measurements for the standard state estimation problem to be processed in a highly parallel fashion with little communication between processors; blocks of measurements can then be incorporated into state estimates with essentially the same computation that usually accompanies the incorporation of a single measurement.

Dissertation
02 Jul 1979
TL;DR: The research described here attacks the problem of constructing more powerful and more flexible computer systems along three fronts: the definition of a virtual machine providing for parallel computation using objects and object references, the development of a distributed implementation mechanism supporting object management functions including garbage collection, and the investigation of scheduling algorithms and collection of performance results.
Abstract: A current-technology computing machine may be roughly decomposed into a processor, a memory, and a data path connecting them. The interposition of this data path between processing and storage elements creates a bottleneck, which inhibits progress at the high-performance end of the technological spectrum. Additionally, the monolithic nature of present-day processors resists incremental addition or removal of processing power. The research described here attacks the problem of constructing more powerful and more flexible computer systems along three fronts: the definition of a virtual machine providing for parallel computation using objects and object references, the development of a distributed implementation mechanism ('reference trees') supporting object management functions including garbage collection, and the investigation of scheduling algorithms and collection of performance results. A reference tree network using these concepts is composed of a multiple of independent small processors, yet operates as a coherent programming system. Programs and data spread automatically and transparently through the network to occupy underused resources. The modular structure of the network provides many parallel data paths as well as allowing for easy addition or removal of modules, thus addressing some of the problems discussed here. A prototype reference tree network, the MuNet, is currently in operation. (Author)

Journal ArticleDOI
TL;DR: The Q-spline interpolation method is presented, designed for incremental curve definition, local curve modification, “on-the-curve” control points, and computational efficiency in an array processing environment.

Patent
26 Mar 1979
TL;DR: A circuit uses a mathematical transformation based solely on AND and OR connections to recognize a pattern and then to localize it within a zone or range.
Abstract: A circuit is described which makes use of a mathematical transformation based solely on AND and OR connections for the purpose of recognizing patterns, and a process for localizing this pattern within a zone or range. By means of the direct connection of AND and OR gates, a completely parallel processing of each individual bit is achieved in the arithmetic and logic unit so implemented, which signifies a fraction of the processing time as compared to full adders. Simultaneously the equipment is capable of recognizing a specific pattern (object) from a multitude of mutually nested structures (neighborhood) and to localize it by a subsequent procedure.

Journal ArticleDOI
TL;DR: A series of design, software simulation, and fabrication studies is underway to develop a special-purpose high-speed reconstruction computer that will rely upon integrated circuit arithmetic components of advanced design, and highly parallel architecture to execute X-ray based transaxial reconstruction algorithms at the rate of hundreds of cross sections/sec.
Abstract: In order to achieve the computational capability to carry out many thousands of cross-sectional reconstructions, necessary to support a prototype high temporal and spatial resolution cylindrical scanning multiaxial tomographic unit, a series of design, software simulation, and fabrication studies is underway to develop a special-purpose high-speed reconstruction computer. This processor will rely upon integrated circuit arithmetic components of advanced design, and highly parallel architecture to execute X-ray based transaxial reconstruction algorithms at the rate of hundreds of cross sections/sec.

Proceedings ArticleDOI
06 Nov 1979
TL;DR: A FORTRAN program has been written for a manipulator and is being applied to the Stanford manipulator with a Z8000 microcomputer.
Abstract: The advantage of using a multiprocessor controller for a mechanical manipulator is that parallel computations may be arranged to achieve a minimum computing time so that real-time control is possible. The parallel processing scheme utilizes a number of CPUs and pursues the following steps. First, divide the entire task into subtasks. Based on the precedence relations, an optimum order of execution for each CPU is obtained by using an algorithm which includes, alternatively, forward and backward phases. In each forward phase it seeks the currently available, shorter computing time, while in each backward phase it selects a better alternative. A FORTRAN program has been written for a manipulator and is being applied to the Stanford manipulator with a Z8000 microcomputer.

Journal ArticleDOI
TL;DR: This paper puts forward a design for a computer system based on an array of "Single chip" microcomputers that supports the requirements of Highly Reliable Software for execution of parallel programs.
Abstract: This paper puts forward a design for a computer system based on an array of "Single chip" microcomputers. The design supports the requirements of Highly Reliable Software for execution of parallel programs.

Journal ArticleDOI
01 Oct 1979
TL;DR: In this article, reaction times to color/color, color/color-name, color-name/color-name, and color/associate pairs were measured under simultaneous pairing and priming conditions.
Abstract: Reaction times to color/color, color/color-name, color-name/color-name, and color/associate pairs were measured under simultaneous pairing and priming conditions. The results indicated that the briefest reaction time occurred under color-to-color matching, but that the reaction time latencies among conditions were similar when the prime preceded the to-be-matched item by 1,500 msec. The results were interpreted in terms of a modified parallel processing model.

Patent
08 Oct 1979
TL;DR: A process-dividing computer group system lets individual computers perform different processing in parallel according to the contents read from a processing memory unit, and redistributes the processing when a computer fails.
Abstract: PURPOSE: To improve reliability and operate efficiently with a process-dividing computer group system that permits the respective computers to perform different processing, by having them perform parallel processing according to the contents read out of a processing memory unit. CONSTITUTION: When the system starts, processing dividing device 2 extracts each piece of processing from processing memory unit 1 according to internally stored reference data and assigns it to computer group 3. Computer group 3 then runs the specified processing; computers 3-1 and 3-2, for example, run different processing in parallel. If computer 3-2 stops, a fault signal is sent to monitor unit 5. On receiving this signal, unit 5 sends device 2 an interruption signal informing it of the fault in computer 3-2. In response to this interruption signal, device 2 divides the processing again. Consequently, the multiple computer system can be operated efficiently and the reliability of the system can also be improved. COPYRIGHT: (C)1981, JPO&Japio

Proceedings ArticleDOI
15 May 1979
TL;DR: The Load Flow problem is treated as a minimization problem and is solved using a parallel nongradient optimization procedure similar to the one suggested by Chazan and Miranker, and a speed-up nearly equal to q is possible.
Abstract: The Load Flow problem is treated as a minimization problem and is solved using a parallel nongradient optimization procedure similar to the one suggested by Chazan and Miranker. The algorithm is described and test case results are presented. A speed-up nearly equal to q is possible if a parallel computer with q processors is used for the solution of the problem.

ReportDOI
01 Jul 1979
TL;DR: This paper studies local reconfiguration of trees into arrays and vice versa and also studies the construction of adjacency graphs and quadtrees for images stored in cellular array processors.
Abstract: This paper studies local reconfiguration of trees into arrays and vice versa. It also studies the construction of adjacency graphs and quadtrees for images stored in cellular array processors. (Author)


Journal ArticleDOI
TL;DR: It is shown that a high-level system, which provides virtually unlimited parallel processing capability independent of geographical distribution, can be mapped in two steps to a low-level system which only knows sequential processors.

Patent
29 Jun 1979
TL;DR: A controller that receives data from terminal units requests processing from two data processors and can switch them between a parallel processing mode and a master-slave processing mode.
Abstract: PURPOSE: To make it possible to select either a parallel or a master-slave processing mode for the data processors, by giving the function of requesting processing from the data processors to the controller that receives the data input from the terminal units. CONSTITUTION: The system comprises controller 2, which receives data from a plurality of terminal units 3-0, 3-1 ... 3-n, and data processors 1-0 and 1-1, which execute processing based on the data from controller 2. Controller 2 is provided with distribution circuit 4, which feeds the data to units 1-0 and 1-1 in parallel or to either one of them selectively, collation circuit 3, which collates the results of processing, and selection circuit 5, which connects controller 2 to units 1-0 and 1-1. Further, selection control unit 9 is controlled by controller 2, and units 1-1 and 1-0 are placed in either the parallel processing mode or the master-slave processing mode, in which one acts as master and the other as slave processor.

Journal ArticleDOI
TL;DR: A framework for an application‐oriented language processor is described which is based on a series of independent modules operating in parallel, and the way in which this allows modules to be added or removed at will is discussed.
Abstract: A framework for an application-oriented language processor is described which is based on a series of independent modules operating in parallel. All intermodule communication is handled by means of monitor procedures. The way in which this allows modules to be added or removed at will is discussed, and the implications of this approach for adaptable processors are described, with particular reference to the numerical control of machine tools.