scispace - formally typeset

Showing papers on "Distributed memory published in 1979"


Journal ArticleDOI
TL;DR: The design of DIRECT, a multiprocessor organization for supporting relational database management systems, is presented; its multiple-instruction multiple-data stream (MIMD) architecture can simultaneously support both intra-query and inter-query concurrency.
Abstract: The design of DIRECT, a multiprocessor organization for supporting relational database management systems, is presented. DIRECT has a multiple-instruction multiple-data stream (MIMD) architecture. It can simultaneously support both intra-query and inter-query concurrency. The number of processors assigned to a query is dynamically determined by the priority of the query, the type and number of relational algebra operations it contains, and the size of the relations referenced. Since DIRECT is a virtual memory machine, the maximum relation size is not limited to that of the associative memory as in some other database machines. Concurrent updates are controlled through the use of locks on relations which are maintained by a controlling processor.
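The relation-level locking described in the abstract can be pictured with a small software sketch (illustrative only, not from the paper; `RelationLockManager` and all other names are invented): a controlling component grants an exclusive lock per relation, so concurrent updates to the same relation are serialized.

```python
import threading

class RelationLockManager:
    """Grants exclusive locks on whole relations, serializing updates."""
    def __init__(self):
        self._locks = {}                      # relation name -> Lock
        self._table_guard = threading.Lock()  # protects the lock table

    def _lock_for(self, relation):
        with self._table_guard:
            return self._locks.setdefault(relation, threading.Lock())

    def update(self, relation, apply_change):
        # Concurrent updates to the same relation are serialized here.
        with self._lock_for(relation):
            return apply_change()

mgr = RelationLockManager()
counter = {"rows": 0}

def bump():
    counter["rows"] += 1

threads = [threading.Thread(target=lambda: mgr.update("EMP", bump))
           for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter["rows"])  # 8
```

Locking whole relations (rather than pages or tuples) trades concurrency within a relation for a very small lock table, which fits the single-controlling-processor design the abstract describes.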

220 citations


Patent
26 Nov 1979
TL;DR: Today's array processors provide a cost-effective tool for increasing the speed at which highly computation-bound processing jobs can be carried out.
Abstract: A high-speed parallel array data processing architecture fashioned under a computational envelope approach includes a data base memory for secondary storage of programs and data, and a plurality of memory modules interconnected to a plurality of processing modules by a connection network of the Omega gender. Programs and data are fed from the data base memory to the plurality of memory modules, and from there the programs are fed through the connection network to the array of processors (one copy of each program for each processor). Execution of the programs occurs with the processors normally operating quite independently of each other in a multiprocessing fashion. For data-dependent operations and other suitable operations, all processors are instructed to finish one given task or program branch before all are instructed to proceed in parallel-processing fashion on the next instruction. Even when functioning in the parallel processing mode, however, the processors are not run in lock-step but execute their own copy of the program individually unless or until another overall processor-array synchronization instruction is issued.
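The finish-before-proceed behavior described above is what would now be called a barrier: each processor runs its program copy independently, then waits at an array-wide synchronization point before anyone continues. A minimal sketch with Python threads (the patent describes hardware, not this API; all names here are invented):

```python
import threading

NUM_PROCS = 4
results = []
barrier = threading.Barrier(NUM_PROCS)  # the "synchronization instruction"
results_lock = threading.Lock()

def worker(pid):
    # Phase 1: each processor executes its own copy of the program,
    # entirely independently of the others.
    partial = pid * pid
    barrier.wait()  # all must finish phase 1 before any proceeds
    # Phase 2: runs only after every processor has reached the barrier.
    with results_lock:
        results.append(partial)

threads = [threading.Thread(target=worker, args=(i,)) for i in range(NUM_PROCS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sorted(results))  # [0, 1, 4, 9]
```

Between barriers the threads are free-running, mirroring the patent's point that the processors are not lock-stepped even in the parallel mode.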

123 citations


Book
01 Jan 1979
TL;DR: Parallel applications: Computational fluid dynamics.
Abstract: General concepts: Speedup.- Efficiency.- Redundancy.- Isoefficiency.- Amdahl's law.- Computer architecture: Concepts.- Sequential consistency. Relaxed consistency. Memory models. Cache coherency. Synchronization instructions. Synchronization devices. Interconnection networks. Branch prediction. Instruction level parallelism. Transactional memories. Thread-level speculation. Latency hiding. Atomicity. Fences.- Parallel machine designs.- Shared-memory multiprocessors. Cache-only memory architectures. Multicores. Clusters. Distributed memory machines. Distributed-shared memory machines. Array machines. Pipelined vector machines. Mainframes. Dataflow machines. VLIW machines. EPIC machines. SMT machines. GPUs. Multimedia extensions (SSE, Altivec). Superscalar machines. FPGAs.- Machines.- Illiac IV. Cray 1, Cray 2, Cray X-MP, ... Denelcor HEP. Tera. Multiflow. Connection Machine, CM-2. Maspar. Ultracomputer. Intel hypercube. Fujitsu series. Hitachi series. NEC series. IBM's Blue Gene. IBM's Cell processor. C.mmp. Cm*. Cedar. Flash. Alliant multiprocessors. Convex machines. KSR machines.- Benchmarks.- LINPACK. Perfect benchmarks. NAS benchmarks. SPEC HPG benchmark suites. Flash benchmarks. TOP500.- Parallel programming: Concepts.- Implicit parallelism. Explicit parallelism. Process. Task. Thread. Thread-safe routines. Locality. Communication. Races. Nondeterminacy. Monitors. Semaphores. Deadlock. Livelock. Scheduling theory. Loop scheduling. Affinity scheduling. Task stealing. Futures. Critical region. Producer-consumer. Communicating Sequential Processes. Doall. Doacross. MapReduce. Data and control dependence. Dependence analysis. Autoparallelization. Run-time speculation. Inspector/executor. Software pipelining.- Designs.- Languages. Libraries. Tools.- Algorithms: Concepts.- Synchronous. Asynchronous. Systolic algorithms. Cache-oblivious algorithms.- Algorithms.- Numerical. Graph. Sorting. Garbage collection. Data management.- Libraries.- SuperLU. Pardiso. SPIKE. FFTW. Spiral.- Parallel applications: Computational fluid dynamics.- Bio-molecular simulation (NAMD).- Cosmology.- Quantum chemistry.

54 citations


Patent
08 Mar 1979
TL;DR: In this paper, the array processor is controlled by a 76-bit microcode extension to one sector of a number of sectors of a control store in the CPU, which can be overridden by interrupt and other control signals generated by the CPU.
Abstract: An array processor which is an integral part of a central processing unit (CPU) has a local memory which is part of the main memory address space. Furthermore, the array processor has its own port into the local memory, leaving a system bus free while the array processor is working. The array processor is controlled so that data can be transferred between the main memory and the local memory either before, during, or after operation of data manipulation hardware which is part of the array processor. This data manipulation hardware utilizes a fast multiplier, and fast add, subtract, and compare circuitry. The array processor is controlled by a 76-bit microcode extension to one sector of a number of sectors of a control store in the CPU. The microcode extension can be overridden by interrupt and other control signals generated by the CPU.

54 citations


Journal ArticleDOI
TL;DR: An approximate analysis is performed of an often studied model of an interleaved memory, multiprocessor system consisting of M memory modules and N processors and the one approximation used is found to result in negligible error.
Abstract: An approximate analysis is performed of an often studied model of an interleaved memory, multiprocessor system consisting of M memory modules and N processors. A closed-form solution is obtained and the one approximation used is found to result in negligible error. This solution is about an order of magnitude more accurate than the best previous result.
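The model studied here is often simplified to: each of N processors issues one request per cycle to one of M modules chosen uniformly at random, and each module serves at most one request per cycle. A toy Monte Carlo check of that memoryless simplification (not the paper's approximation, which also accounts for blocked requests resubmitting; all names are invented):

```python
import random

def simulate_bandwidth(M, N, cycles=20000, seed=1):
    """Average number of distinct modules referenced per cycle when each
    of N processors requests a uniformly random one of M modules."""
    rng = random.Random(seed)
    busy = 0
    for _ in range(cycles):
        busy += len({rng.randrange(M) for _ in range(N)})
    return busy / cycles

def expected_bandwidth(M, N):
    # Closed form for this memoryless model: a module is idle with
    # probability (1 - 1/M)**N, so M * (1 - (1 - 1/M)**N) are busy.
    return M * (1 - (1 - 1 / M) ** N)

est = simulate_bandwidth(8, 4)
print(abs(est - expected_bandwidth(8, 4)) < 0.05)  # True
```

The value of the closed-form solutions the abstract refers to is exactly that they replace simulations like this one with a formula whose error can be characterized.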

51 citations


Journal ArticleDOI
TL;DR: The combined effect of interference due to bus contentions and that due to memory conflicts is investigated and an algorithm is developed to obtain a mapping which can be used to minimize memory conflicts.
Abstract: This paper studies one of the most difficult and important problems in the design of cost-effective multiple-microprocessor systems. The combined effect of interference due to bus contentions and that due to memory conflicts is investigated. Reference models are defined and applied to obtain analytic results which can prove to be valuable tools in the design of multiple-microprocessor systems for time-critical control processes. The effect of memory mapping is also investigated and an algorithm is developed to obtain a mapping which can be used to minimize memory conflicts.
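The paper's mapping algorithm is not reproduced here, but the idea of placing program segments across memory modules so that references are spread out can be illustrated with a simple greedy heuristic (a hypothetical stand-in, not the authors' algorithm; all names are invented):

```python
def map_segments(reference_counts, num_modules):
    """Greedy mapping: place the most-referenced segments first, each on
    the currently least-loaded module, to spread memory conflicts."""
    loads = [0] * num_modules
    mapping = {}
    for seg, refs in sorted(reference_counts.items(), key=lambda kv: -kv[1]):
        target = loads.index(min(loads))  # least-loaded module so far
        mapping[seg] = target
        loads[target] += refs
    return mapping, loads

# Hypothetical per-segment reference counts for four program segments.
mapping, loads = map_segments({"A": 50, "B": 30, "C": 20, "D": 10}, 2)
print(loads)  # [60, 50]
```

Balancing the per-module reference load reduces the chance that two processors contend for the same module, which is the kind of conflict the paper's analytic models quantify.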

45 citations


Proceedings Article
20 Aug 1979
TL;DR: A computer memory organization modeled after human memory is proposed, along with strategies for accessing that memory, and a program which implements the theory is described.
Abstract: A computer memory organization modeled after human memory is proposed, along with strategies for accessing that memory. CYRUS, a program which implements the theory, is described.

35 citations


Patent
15 Jan 1979
TL;DR: In this article, a data processing system is disclosed in which a high-speed processor is added to a slow-speed processor and in which both processors have access to a first memory unit, with the slow processor having access priority over the fast processor.
Abstract: A data processing system is disclosed in which a high-speed processor is added to a slow-speed processor and in which both processors have access to a first memory unit with the slow processor having access priority over the fast processor. In order to allow the fast processor to operate without losing data when a conflict occurs during a write operation, a second memory is coupled to the fast processor in which is stored all the data stored in the first memory. When the fast processor attempts to write into both memories but fails to write into the first memory due to a conflict with the slow processor, the data stored in the second memory is then transferred to the first memory subsequent to the completion of the access operation by the slow processor. This arrangement allows the fast processor to complete the write operation interrupted by the conflicts with the slow processor, thereby allowing the fast processor and the slow processor to have access to the same data. Both memories are continuously balanced by the fast processor so that each memory will contain the same data allowing both processors access to the same data.
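The write-conflict recovery scheme can be sketched in a few lines (an illustrative software analogue of the patent's hardware; the class and method names are invented): the fast processor always writes its private copy, and any write that loses arbitration on the shared memory is queued and replayed once the slow processor's access completes.

```python
class DualMemory:
    """Software analogue of the two-memory arrangement: a shared first
    memory (slow processor has priority) and a private second memory
    that always receives the fast processor's writes."""
    def __init__(self, size):
        self.shared = [0] * size   # first memory unit
        self.private = [0] * size  # second memory, fast processor's copy
        self.pending = []          # addresses deferred by a conflict

    def fast_write(self, addr, value, conflict):
        self.private[addr] = value      # never lost
        if conflict:
            self.pending.append(addr)   # replay after the slow access ends
        else:
            self.shared[addr] = value

    def resolve_conflicts(self):
        # Slow processor's access completed: copy deferred data across
        # so both memories again hold the same data.
        for addr in self.pending:
            self.shared[addr] = self.private[addr]
        self.pending.clear()

m = DualMemory(4)
m.fast_write(0, 7, conflict=False)
m.fast_write(1, 9, conflict=True)  # slow processor held the first memory
m.resolve_conflicts()
print(m.shared[:2])  # [7, 9]
```

After `resolve_conflicts`, both memories agree, which is the "continuously balanced" property the abstract describes.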

26 citations


Proceedings ArticleDOI
23 Apr 1979
TL;DR: X-NODE is a single-chip VLSI processor to be realized in the mid 1980's and to be used as a building block for a tree-structured multiprocessor system (X-TREE).
Abstract: X-NODE is a single-chip VLSI processor to be realized in the mid-1980s and to be used as a building block for a tree-structured multiprocessor system (X-TREE). Three major trends influence the design of this processor: the continuing evolution of VLSI technology, the requirements for parallelism and communication in a multiprocessor system, and the need for better support of software and high-level language constructs. The influence of these trends on the processor architecture is discussed and the current state of the design of X-NODE is outlined. X-NODE will introduce several new features exploiting the full potential of VLSI technology. The processor and a hierarchical memory of multiple device types will be combined on a single chip to provide a powerful processor. With basically a memory-to-memory architecture, an on-chip caching scheme provides the performance of a register-based architecture. This on-chip memory hierarchy contains program and data, as well as microcode. The instruction set of any processor can thus be dynamically changed and tailored to the specific problem being executed. It is planned to support high-level language constructs directly in hardware through mechanisms such as bounds checking.

16 citations


01 Aug 1979
TL;DR: PASM, a large-scale multimicroprocessor system being designed at Purdue University for image processing and pattern recognition, is described, and examples of how PASM can be used to perform image processing tasks are given.
Abstract: PASM, a large-scale multimicroprocessor system being designed at Purdue University for image processing and pattern recognition, is described. This system can be dynamically reconfigured to operate as one or more independent SIMD and/or MIMD machines. PASM consists of a Parallel Computation Unit, which contains N processors, N memories, and an interconnection network; Q Micro Controllers, each of which controls N/Q processors; N/Q parallel secondary storage devices; a distributed Memory Management System; and a System Control Unit, to coordinate the other system components. Possible values for N and Q are 1024 and 16, respectively. The control schemes, interprocessor communications, and memory management in PASM are explored. Examples of how PASM can be used to perform image processing tasks are given. (Author)
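The fixed N/Q partitioning of processors among Micro Controllers is easy to state in code (a sketch under the stated values N = 1024 and Q = 16; the function name is invented):

```python
def partition(n_procs, q_controllers):
    """Assign each of Q Micro Controllers a fixed block of N/Q processors;
    independent SIMD/MIMD machines are then formed from groups of blocks."""
    assert n_procs % q_controllers == 0, "N must be a multiple of Q"
    block = n_procs // q_controllers
    return [list(range(c * block, (c + 1) * block))
            for c in range(q_controllers)]

groups = partition(1024, 16)
print(len(groups), len(groups[0]), groups[1][0])  # 16 64 64
```

With N = 1024 and Q = 16 each controller manages 64 processors, and a reconfiguration amounts to choosing which controllers (and hence which blocks) cooperate as one machine.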

6 citations


Journal ArticleDOI
Faye A. Briggs1
13 Aug 1979
TL;DR: It is shown how effective buffering can be used to reduce the system cost while effectively maintaining a high level of performance in a multiprocessor system.
Abstract: A simulation model is developed and used to study the effect of buffering of memory requests on the performance of multiprocessor systems. A multiprocessor system is generalized as a parallel-pipelined processor of order (s,p), which consists of p parallel processors, each of which is a pipelined processor with s degrees of multiprogramming; hence there can be up to s*p memory requests in each instruction cycle. The memory, which consists of N(=2^n) identical memory modules, is organized so that there are l(=2^i) lines of m(=2^(n-i)) memory modules each, where each module is characterized by an address cycle (address hold time) of a time units and a memory cycle of c time units. Too large an l is undesirable in a multiprocessor system because of the cost of the processor-memory interconnection network. Hence, we show how effective buffering can be used to reduce the system cost while maintaining a high level of performance.
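A toy discrete-time version of such a simulation (much simpler than the paper's model, which also includes the address and memory cycle times a and c; all names are invented) buffers up to s*p requests per cycle across l lines:

```python
from collections import deque
import random

def simulate(s, p, lines, cycles=5000, seed=2):
    """Each cycle, s*p requests are buffered at uniformly random lines;
    each line completes at most one request per cycle. Returns the
    average line utilization."""
    rng = random.Random(seed)
    queues = [deque() for _ in range(lines)]
    served = 0
    for _ in range(cycles):
        for _ in range(s * p):
            queues[rng.randrange(lines)].append(1)  # buffer the request
        for q in queues:
            if q:                                   # serve one per line
                q.popleft()
                served += 1
    return served / (cycles * lines)

u = simulate(s=2, p=2, lines=8)
print(round(u, 2))  # 0.5
```

With 4 requests per cycle offered to 8 lines, utilization settles near 0.5; the buffers absorb the cycles in which several requests happen to target the same line, which is the effect the paper quantifies.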

Patent
11 Sep 1979
TL;DR: In this patent, fixing a function to each processor is avoided by deciding a main processor, loading the programs into the common memory, and then having each processor load its own program into its individual memory.
Abstract: PURPOSE: To avoid fixing the function of each processor, by loading the programs into the common memory once a main processor is decided, and then having each processor load its own program into its individual memory. CONSTITUTION: When the initial-set start key is pressed, the circulating interruption control circuit 12 decides an interruption destination and interrupts, e.g., processor #0. Processor #0 then executes the program loader residing in the common memory 1 and loads the program corresponding to processor #1 into memory 1, determining whether processor #1 is operational. If processor #1 is not operational, the program corresponding to, e.g., processor #2 is loaded into memory 1 instead. If the processor is operational, it is interrupted; the interrupted processor, e.g., processor #2, then loads the program from memory 1 into its own individual memory. COPYRIGHT: (C)1981,JPO&Japio


Journal ArticleDOI
TL;DR: A multiprocessor microcomputer is described which can be configured and programmed as an intelligent control station or dispatcher center in a distributed hierarchical process control system, together with a universal real-time multitask monitor.


12 Nov 1979
TL;DR: The Supervisory Control and Diagnostics System (SCDS) for the Mirror Fusion Test Facility (MFTF) is a multiprocessor minicomputer system designed so that for most single-point failures, the hardware may be quickly reconfigured to provide continued operation of the experiment.
Abstract: The Supervisory Control and Diagnostics System (SCDS) for the Mirror Fusion Test Facility (MFTF) is a multiprocessor minicomputer system designed so that for most single-point failures, the hardware may be quickly reconfigured to provide continued operation of the experiment. The system is made up of nine Perkin-Elmer computers - a mixture of 8/32's and 7/32's. Each computer has ports on a shared memory system consisting of two independent shared memory modules. Each processor can signal other processors through hardware external to the shared memory. The system communicates with the Local Control and Instrumentation System, which consists of approximately 65 microprocessors. Each of the six system processors has facilities for communicating with a group of microprocessors; the groups consist of from four to 24 microprocessors. There are hardware switches so that if an SCDS processor communicating with a group of microprocessors fails, another SCDS processor takes over the communication.
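The switch-over behavior described for SCDS can be sketched abstractly (illustrative only; the facility used hardware switches, not software like this, and all names are invented): when the processor serving a group of microprocessors fails, its group is handed to a surviving processor.

```python
def reassign(group_map, failed, processors):
    """Reassign every microprocessor group served by a failed processor
    to the first surviving processor (a deliberately simple policy)."""
    survivors = [p for p in processors if p != failed]
    assert survivors, "no surviving processor to take over"
    return {group: (survivors[0] if proc == failed else proc)
            for group, proc in group_map.items()}

group_map = {"G1": "P1", "G2": "P2"}
new_map = reassign(group_map, "P1", ["P1", "P2", "P3"])
print(new_map)  # {'G1': 'P2', 'G2': 'P2'}
```

A real reconfiguration policy would balance the taken-over groups across survivors; the point here is only that the group-to-processor mapping is data that can be rewritten after a single-point failure.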