
Showing papers on "Distributed memory published in 1980"


Book
01 Jan 1980
TL;DR: This book covers associative memory, content addressing, and associative recall; content addressing by software (hash coding); the logic principles and hardware of content-addressable memories (CAMs); the CAM as a system part; and content-addressable processors.
Abstract: Table of contents:
1 Associative Memory, Content Addressing, and Associative Recall: 1.1 Introduction; 1.1.1 Various Motives for the Development of Content-Addressable Memories; 1.1.2 Definitions and Explanations of Some Basic Concepts; 1.2 The Two Basic Implementations of Content Addressing; 1.2.1 Software Implementation: Hash Coding; 1.2.2 Hardware Implementation: The CAM; 1.3 Associations; 1.3.1 Representation and Retrieval of Associated Items; 1.3.2 Structures of Associations; 1.4 Associative Recall: Extensions of Concepts; 1.4.1 The Classical Laws of Association; 1.4.2 Similarity Measures; 1.4.3 The Problem of Infinite Memory; 1.4.4 Distributed Memory and Optimal Associative Mappings; 1.4.5 Sequential Recollections.
2 Content Addressing by Software: 2.1 Hash Coding and Formatted Data Structures; 2.2 Hashing Functions; 2.3 Handling of Collisions; 2.3.1 Some Basic Concepts; 2.3.2 Open Addressing; 2.3.3 Chaining (Coalesced); 2.3.4 Chaining Through a Separate Overflow Area; 2.3.5 Rehashing; 2.3.6 Shortcut Methods for Speedup of Searching; 2.4 Organizational Features and Formats of Hash Tables; 2.4.1 Direct and Indirect Addressing; 2.4.2 Basic Formats of Hash Tables; 2.4.3 An Example of Special Hash Table Organization; 2.5 Evaluation of Different Schemes in Hash Coding; 2.5.1 Average Length of Search with Different Collision Handling Methods; 2.5.2 Effect of Hashing Function on the Length of Search; 2.5.3 Special Considerations for the Case in Which the Search Is Unsuccessful; 2.6 Multi-Key Search; 2.6.1 Lists and List Structures; 2.6.2 An Example of Implementation of Multi-Key Search by Hash Index Tables; 2.6.3 The Use of Compound Keywords in Hashing; 2.7 Implementation of Proximity Search by Hash Coding; 2.8 The TRIE Memory; 2.9 Survey of Literature on Hash Coding and Related Topics.
3 Logic Principles of Content-Addressable Memories: 3.1 Present-Day Needs for Hardware CAMs; 3.2 The Logic of Comparison Operations; 3.3 The All-Parallel CAM; 3.3.1 Circuit Logic of a CAM Bit Cell; 3.3.2 Handling of Responses from the CAM Array; 3.3.3 The Complete CAM Organization; 3.3.4 Magnitude Search with the All-Parallel CAM; 3.4 The Word-Parallel, Bit-Serial CAM; 3.4.1 Implementation of the CAM by the Linear-Select Memory Principle; 3.4.2 Skew Addressing; 3.4.3 Shift Register Implementation; 3.4.4 The Results Storage; 3.4.5 Searching on More Complex Specifications; 3.5 The Word-Serial, Bit-Parallel CAM; 3.6 Byte-Serial Content-Addressable Search; 3.6.1 Coding by the Characters; 3.6.2 Specifications Used in Document Retrieval; 3.6.3 A Record-Parallel, Byte-Serial CAM for Document Retrieval; 3.7 Functional Memories; 3.7.1 The Logic of the Bit Cell in the FM; 3.7.2 Functional Memory 1; 3.7.3 Functional Memory 2; 3.7.4 Read-Only Functional Memory; 3.8 A Formalism for the Description of Micro-Operations in the CAM; 3.9 Survey of Literature on CAMs.
4 CAM Hardware: 4.1 The State-of-the-Art of the Electronic CAM Devices; 4.2 Circuits for All-Parallel CAMs; 4.2.1 Active Electronic Circuits for CAM Bit Cells; 4.2.2 Cryotron-Element CAMs; 4.2.3 Josephson Junctions and SQUIDs for Memories; 4.3 Circuits for Bit-Serial and Word-Serial CAMs; 4.3.1 Semiconductor RAM Modules for the CAM; 4.3.2 Magnetic Memory Implementations of the CAM; 4.3.3 Shift Registers for Content-Addressable Memory; 4.3.4 The Charge-Coupled Device (CCD); 4.3.5 The Magnetic-Bubble Memory (MBM); 4.4 Optical Content-Addressable Memories; 4.4.1 Magneto-Optical Memories; 4.4.2 Holographic Content-Addressable Memories.
5 The CAM as a System Part: 5.1 The CAM in Virtual Memory Systems; 5.1.1 The Memory Hierarchy; 5.1.2 The Concept of Virtual Memory and the Cache; 5.1.3 Memory Mappings for the Cache; 5.1.4 Replacement Algorithms; 5.1.5 Updating of Multilevel Memories; 5.1.6 Automatic Control of the Cache Operations; 5.1.7 Buffering in a Multiprocessor System; 5.1.8 Additional Literature on Memory Organizations and Their Evaluation; 5.2 Utilization of the CAM in Dynamic Memory Allocation; 5.2.1 Memory Map and Address Conversion; 5.2.2 Loading of a Program Segment; 5.3 Content-Addressable Buffer; 5.4 Programmable Logic; 5.4.1 RAM Implementation; 5.4.2 CAM Implementation; 5.4.3 FM Implementation; 5.4.4 Other Implementations of Programmable Logic; 5.4.5 Applications of the CAM in Various Control Operations.
6 Content-Addressable Processors: 6.1 Some Trends in Content-Addressable Memory Functions; 6.2 Distributed-Logic Memories (DLMs); 6.3 The Augmented Content-Addressable Memory (ACAM); 6.4 The Association-Storing Processor (ASP); 6.5 Content-Addressable Processors with High-Level Processing Elements; 6.5.1 The Basic Array Processor Architecture; 6.5.2 The Associative Control Switch (ACS) Architecture; 6.5.3 An Example of Content-Addressable Array Processors: RADCAP; 6.5.4 An Example of Content-Addressable Ensemble Processors: PEPE; 6.6 Bit-Slice Content-Addressable Processors; 6.6.1 The STARAN Computer; 6.6.2 Orthogonal Computers; 6.7 An Overview of Parallel Processors; 6.7.1 Categorizations of Computer Architectures; 6.7.2 Survey of Additional Literature on Content-Addressable and Parallel Processing.
7 Review of Research Since 1979: 7.1 Research on Hash Coding; 7.1.1 Review Articles; 7.1.2 Hashing Functions; 7.1.3 Handling of Collisions; 7.1.4 Hash Table Organization; 7.1.5 Linear Hashing; 7.1.6 Dynamic, Extendible, and External Hashing; 7.1.7 Multiple-Key and Partial-Match Hashing; 7.1.8 Hash-Coding Applications; 7.1.9 Hash-Coding Hardware; 7.2 CAM Hardware; 7.2.1 CAM Cells; 7.2.2 CAM Arrays; 7.2.3 Dynamic Memories; 7.2.4 CAM Systems; 7.3 CAM Applications; 7.4 Content-Addressable Parallel Processors; 7.4.1 Architectures for Content-Addressable Processors; 7.4.2 Data Base Machines; 7.4.3 Applications of Content-Addressable Processors; 7.5 Optical Associative Memories; 7.5.1 General; 7.5.2 Holographic Content-Addressable Memories.
References.
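
Chapter 2's collision-handling schemes lend themselves to a compact illustration. Below is a minimal Python sketch, not from the book, contrasting open addressing with linear probing (Sect. 2.3.2) against chaining through a separate overflow area (Sect. 2.3.4); the table size and linking scheme are illustrative assumptions.

    # Illustrative sketch of two collision-handling schemes (not from the book).
    SIZE = 11                      # small prime table size for the example

    def h(key):
        return hash(key) % SIZE

    # Open addressing with linear probing: on a collision, step to the next slot.
    open_table = [None] * SIZE

    def insert_open(key, value):   # assumes the table never fills up
        i = h(key)
        while open_table[i] is not None and open_table[i][0] != key:
            i = (i + 1) % SIZE
        open_table[i] = (key, value)

    # Chaining through a separate overflow area: colliding entries are appended
    # to an overflow list and linked from the home slot.
    home = [None] * SIZE           # entries are (key, value, overflow link or None)
    overflow = []

    def insert_chained(key, value):
        i = h(key)
        if home[i] is None:
            home[i] = (key, value, None)
        else:
            overflow.append((key, value, home[i][2]))
            home[i] = (home[i][0], home[i][1], len(overflow) - 1)

    insert_open("cam", 1); insert_chained("cam", 1)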

368 citations


Proceedings ArticleDOI
06 May 1980
TL;DR: A model of a large multiprocessor system is developed and techniques are given by which each processing element can correctly diagnose failures in all other processing elements in the system.
Abstract: Techniques for dealing with hardware failures in very large networks of distributed processing elements are presented. A concept known as distributed fault-tolerance is introduced. A model of a large multiprocessor system is developed and techniques, based on this model, are given by which each processing element can correctly diagnose failures in all other processing elements in the system. The effect of varying system interconnection structures upon the extent and efficiency of the diagnosis process is discussed and illustrated with an example of an actual system. Finally, extensions to the model, which render it more realistic, are given, and a modified version of the diagnosis procedure is presented which operates under this model.
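
As a rough illustration of the diagnosis idea (not the paper's algorithm), the sketch below assumes PMC-style mutual testing: fault-free elements test their neighbours and report truthfully, while faulty elements give arbitrary outcomes; the interconnection graph is a toy example.

    # Hedged sketch of mutual testing for fault diagnosis (illustrative only).
    import random

    def test_round(edges, faulty):
        """Collect the syndrome: (tester, tested) -> 0 (pass) or 1 (fail)."""
        syndrome = {}
        for a, b in edges:
            for tester, tested in ((a, b), (b, a)):
                if tester in faulty:
                    outcome = random.randint(0, 1)     # faulty testers are unreliable
                else:
                    outcome = 1 if tested in faulty else 0
                syndrome[(tester, tested)] = outcome
        return syndrome

    # A fault-free node can then grow a trusted set: believe verdicts only from
    # nodes it has itself already judged fault-free (a simplistic decoding rule).
    edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
    print(test_round(edges, faulty={2}))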

198 citations


Patent
19 Nov 1980
TL;DR: In this article, an automatic fault recovery system for a multiple processor control complex of a telecommunications switching system is described, which deals with the occurrence of soft faults and localizes the effect of errors, with the goal of minimizing disruption of calls through the switching system.
Abstract: An automatic fault recovery system for a multiple processor control complex of a telecommunications switching system is disclosed. The fault recovery system has a hierarchical structure which deals with the occurrence of soft faults and localizes insofar as possible the effect of errors, with the goal of minimizing disruption of calls through the switching system. Included in the steps taken by the recovery system are rewriting memory locations in active memory units from standby memory units, switching between active and standby copies of memory units, bus units and central processor units, and instituting progressively more pervasive initializations of all the processors in the control complex. The recovery system includes an arrangement employing a memory block parity check for fast initialization. A time shared error detection and correction activity assures that standby copies of memory units are in condition to become active when required.
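
The recovery hierarchy can be pictured as an escalation ladder; the sketch below merely paraphrases the steps named in the abstract and is not the patent's logic.

    # Illustrative escalation ladder for the recovery hierarchy.
    STEPS = [
        "rewrite faulty locations from the standby memory copy",
        "switch between active and standby memory, bus, or CPU copies",
        "initialize the affected processor",
        "initialize every processor in the control complex",
    ]

    def recover(error_cleared_by):
        """Try the least disruptive step first; escalate only while errors persist."""
        for step in STEPS:
            if error_cleared_by(step):
                return step                      # calls through the switch survive
        raise RuntimeError("recovery exhausted")

    print(recover(lambda step: "switch" in step))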

93 citations


Patent
21 Nov 1980
TL;DR: In this paper, a multi-processor system includes a plurality of processors, a common shared memory, and a programmable memory access priority control circuit, where the priority information is changeable either manually, by an external circuit, or by at least one of the processors.
Abstract: A multi-processor system includes a plurality of processors, a common shared memory, and a programmable memory access priority control circuit. The programmable memory access priority control circuit includes a programmable register circuit and a priority control circuit. The programmable register circuit stores priority information designating a memory access grade priority for each of the processors; the priority information is changeable either manually, by an external circuit, or by at least one of the processors, and remains fixed, irrespective of access of the memory by any of the processors, until being changed. The register circuit outputs priority information signals which indicate the memory access grade priority of each of the processors. The priority control circuit receives the priority information signals from the register circuit, receives memory request signals from the processors requesting access to (i.e., use of) the common shared memory, and outputs an acknowledge signal enabling the processor having the highest memory access grade priority to use the common shared memory in accordance with the priority information.
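
A software analogue makes the arbitration concrete: a register of changeable grades and a circuit that acknowledges the highest-grade requester. All names below are illustrative, not the patent's.

    # Sketch of a programmable memory-access priority arbiter (names illustrative).
    priority = {0: 2, 1: 0, 2: 1}     # register: processor id -> grade (0 = highest)

    def reprogram(proc, grade):
        """Changeable manually, by an external circuit, or by a processor."""
        priority[proc] = grade

    def acknowledge(requests):
        """Grant the shared memory to the highest-grade requester, if any."""
        if not requests:
            return None
        return min(requests, key=lambda p: priority[p])

    print(acknowledge({0, 1, 2}))     # -> 1, the processor holding grade 0
    reprogram(1, 3)
    print(acknowledge({0, 1, 2}))     # -> 2 after reprogramming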

78 citations


Patent
Bourrez Jean-Marie1
11 Dec 1980
TL;DR: In this article, a data processing system includes plural processors, each of which derives a control signal indicating that an event has occurred which requires a change in the status of the system, as well as registers for storing signals indicative of a process being executed by the processor.
Abstract: Applicant processes to be performed on several processors in a data processing system are synchronized and allocated. The data processing system includes plural processors, each of which derives a control signal indicating that an event has occurred which requires a change in the status of the system, as well as registers for storing signals indicative of the process being executed by the processor. A memory common to the processors is selectively coupled to them via a bus, and a circuit connected to the memory and the bus selectively couples signals between a selected processor and the memory. The applicant processes are allocated and synchronized by a first circuit, responsive to the control signal, that allocates one of the processors to an applicant process, and by a second circuit that transfers the context of the process the allocated processor was executing from that processor's registers to the memory over the bus, and then transfers the applicant process's context from the memory to the allocated processor's registers.
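
A minimal software analogue of the two circuits, assuming a simple register context; the patent describes hardware, so everything below is an illustrative rendering.

    # Sketch: allocate a processor to an applicant process and swap contexts
    # through the common memory (illustrative, not the patent's circuits).
    common_memory = {}                  # process id -> saved register context

    def allocate(processor, applicant_pid):
        # First circuit: the event's control signal picks this processor.
        # Second circuit: move the running context out, the applicant's in.
        common_memory[processor["pid"]] = dict(processor["regs"])
        processor["regs"] = common_memory.pop(applicant_pid, {"pc": 0})
        processor["pid"] = applicant_pid

    cpu = {"pid": 1, "regs": {"pc": 100}}
    allocate(cpu, applicant_pid=2)
    print(cpu, common_memory)           # process 1's context now lives in memory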

51 citations


Journal ArticleDOI
TL;DR: A deterministic approach to the preemptive scheduling of independent tasks, which takes into account primary memory allocation in multiprocessor systems with virtual memory and a common primary memory is proposed.
Abstract: This paper proposes a deterministic approach to the preemptive scheduling of independent tasks, which takes into account primary memory allocation in multiprocessor systems with virtual memory and a common primary memory. Each central processing unit (CPU) is assumed to have dedicated paging devices and thus paging-device competition does not exist in the system. The system workload is based on an analytic approximation to the lifetime curve of a task. Exact and approximate algorithms are presented which minimize or tend to minimize the length of schedules on an arbitrary number of identical processors. In the general case, the exact algorithm is based on nonlinear programming; however, the approximate algorithm requires the solution of several nonlinear equations with one unknown. For certain cases, analytical results have also been obtained.
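
The role of the lifetime curve can be seen in a toy calculation. The Belady-type form and the constants below are assumptions for illustration, not the paper's model.

    # Toy use of a task lifetime curve: more primary memory means longer runs
    # between page faults, hence a higher effective CPU rate.
    def lifetime(m, c=1.0, k=2.0):
        """Assumed analytic form: mean compute time between faults at allocation m."""
        return c * m ** k

    def effective_rate(m, fault_service=1000.0):
        e = lifetime(m)
        return e / (e + fault_service)   # fraction of time spent computing

    for m in (8, 16, 32, 64):
        print(m, round(effective_rate(m), 3))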

42 citations


Journal ArticleDOI
TL;DR: Four families of topologies for interconnecting many identical processors into a computer network are described and investigated with respect to bus load, routing algorithms, and the relation between the average interprocessor distance and the size of the network.
Abstract: In this paper, we describe four families of topologies for interconnecting many identical processors into a computer network. Each family extends to arbitrarily many processors while keeping the number of neighbors of any one processor fixed. These families are investigated with respect to bus load, routing algorithms, and the relation between the average interprocessor distance and the size of the network.
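
The average interprocessor distance for any fixed-degree family can be measured by breadth-first search. The sketch below uses a simple ring as a stand-in topology; the paper's four families are not reproduced here.

    # Sketch: average interprocessor distance of a fixed-degree topology.
    from collections import deque

    def avg_distance(n, neighbours):
        total, pairs = 0, 0
        for s in range(n):
            dist = {s: 0}
            q = deque([s])
            while q:
                u = q.popleft()
                for v in neighbours(u):
                    if v not in dist:
                        dist[v] = dist[u] + 1
                        q.append(v)
            total += sum(dist.values())
            pairs += n - 1
        return total / pairs

    ring = lambda u: [(u - 1) % 64, (u + 1) % 64]   # degree stays 2 at every size
    print(avg_distance(64, ring))                   # grows like n/4 for a ring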

41 citations


Patent
03 Mar 1980
TL;DR: A data security module for encrypting and decrypting computer data contains, in addition to the encryption logic, interface logic to allow direct memory access to a computer as discussed by the authors, and after being instructed as to the location and quantity of data by the computer, accesses the data directly from the computer memory without disturbing the processor to provide parallel encryption or decryption.
Abstract: A data security module for encrypting and decrypting computer data contains, in addition to the encryption logic, interface logic to allow direct memory access to a computer. The security module acts as a computer peripheral device and, after being instructed by the computer as to the location and quantity of the data, accesses the data directly from the computer memory without disturbing the processor, providing parallel encryption or decryption of computer memory data.
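
The flow can be mimicked in software: the CPU hands over a buffer descriptor, then the peripheral ciphers memory by direct access while the CPU keeps computing. XOR stands in for the real encryption logic; all names are illustrative.

    # Sketch of DMA-style parallel encryption (placeholder cipher, illustrative).
    import threading

    memory = bytearray(b"sensitive data " * 4)

    def security_module(offset, length, key):
        for i in range(offset, offset + length):    # direct memory access
            memory[i] ^= key                        # placeholder cipher, not DES

    job = threading.Thread(target=security_module, args=(0, len(memory), 0x5A))
    job.start()      # runs in parallel with the "processor"
    job.join()
    print(memory[:15])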

38 citations


Journal ArticleDOI
TL;DR: A multiple-read single-write (MRSW) memory is proposed as a hardware solution to the memory and bus conflict problem in distributed and multiprocessing computing systems.
Abstract: A multiple-read single-write (MRSW) memory is proposed as a hardware solution to the memory and bus conflict problem in distributed and multiprocessing computing systems. Each memory module is assigned to a host processor which is hardwired to its read-write channel. Its read-only channels are shared by a few closely coupled processors, I/O devices, and/or a data bus which provides access to all other processors. The exact processor-memory organization is determined by a module correlation criterion, which also yields a quantitative measure of the effectiveness of the solution. In a class of scientific computing problems where module correlation is limited to neighboring modules, the memory conflict problem is completely eliminated. The processors may operate as array processors controlled by a CPU, or they may operate autonomously with capabilities of originating programs or transactions. The location conflict problem of multiaccess memories is resolved without additional hardware or delay.
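
A behavioural sketch of one module, assuming the ownership rule described in the abstract (class and method names are invented for illustration):

    # Sketch of a multiple-read single-write (MRSW) module.
    class MRSWModule:
        def __init__(self, size, host):
            self.cells = [0] * size
            self.host = host                 # only the host owns the read-write channel

        def write(self, proc, addr, value):
            if proc != self.host:
                raise PermissionError("only the host processor may write")
            self.cells[addr] = value

        def read(self, proc, addr):
            return self.cells[addr]          # read-only channels are shared freely

    m = MRSWModule(16, host=0)
    m.write(0, 3, 42)
    print(m.read(5, 3))   # any coupled processor may read without write conflicts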

36 citations


Proceedings ArticleDOI
19 May 1980
TL;DR: The Database Machine (DBM) is the result of an architectural approach which distributes processing power closer to the devices on which data are stored and offloads database processing functions from the main computer [LAN79].
Abstract: The conventional physical storage mechanism of a computer system is usually comprised of a memory hierarchy that stores programs and data. The requirement for high performance and low cost is achieved through a combination of memories of different speeds. By automatically managing the files so that the most frequently used files reside in fast storage, an overall speed comparable to the speed of the fastest memory can be achieved. However, with the applications of large databases, the maintenance of large files on a conventional memory hierarchy becomes increasingly difficult. Most database applications perform a small number of simple operations on a large amount of data. Usually only a small fraction of the data accessed is required by the application. It is more cost effective to perform database operations directly on the data in the secondary storage in order to avoid the transfer of unnecessary data across different levels of the memory hierarchy. The Database Machine (DBM) is the result of an architectural approach which distributes processing power closer to the devices on which data are stored and offloads database processing functions from the main computer [LAN79].
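
The payoff of moving the operation to the data can be shown in a few lines; the record layout and predicate below are illustrative, not the DBM's.

    # Sketch of the DBM idea: evaluate a selection next to storage and ship only
    # the matching records up the memory hierarchy.
    def storage_side_select(records, predicate):
        """Runs 'on the device': scans locally, returns only qualifying records."""
        return [r for r in records if predicate(r)]

    disk_records = [{"id": i, "balance": i * 10} for i in range(10_000)]
    hits = storage_side_select(disk_records, lambda r: r["balance"] > 99_900)
    print(len(hits))   # only these few records cross the memory hierarchy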

21 citations


Patent
05 May 1980
TL;DR: In this article, a control logic system comprising at least one sub-system including a master processor, two slave processors and an interchange memory which exchanges information between the master processor and the slave processors is described.
Abstract: A control logic system comprising at least one sub-system including a master processor, two slave processors, and an interchange memory which exchanges information between the master processor and the slave processors. The slave processors access a common memory via an interprocessor interface.

Proceedings ArticleDOI
06 May 1980
TL;DR: The EGPA project uses the concept of restricted neighbourhoods to solve the problems of memory access conflicts in general purpose computers: the processors and memories are arranged in a freely extensible cellular structure, and the number of connections to processors and to memories is limited for all sizes of the array.
Abstract: The contemporary general purpose computer of the Princeton type is limited in its performance by the speed of the available logic families. Many projects around the world try to achieve higher performance by using many processors connected via common memories. The hardware costs for the interconnection of many processors and memories have to be limited, and a severe degradation resulting from memory access conflicts must be avoided. The EGPA project uses the concept of restricted neighbourhoods to solve these problems. The processors and memories are arranged in a freely extensible cellular structure. The number of connections to processors and to memories is limited for all sizes of the array. In this paper we present measured results from the pilot system, which show whether memory conflicts influence the speed of the individual processors. The results will be valid for larger arrays, too.

Journal ArticleDOI
TL;DR: Memory contention during multiprogramming when programs are free to compete for page frames is studied, finding that for high ratios of secondary to primary memory access time and under conditions of high memory contention, small programs with compact working sets are able to run with far less than expected interference from larger, more diffuse programs.
Abstract: We study memory contention during multiprogramming when programs are free to compete for page frames. A random walk between the possible partitions of memory over the set of active programs is used to model memory contention and calculate throughput. Our model of contention takes into account program characteristics by using miss ratio curves, and also considers memory size and page fetch latency. With the aid of numerous trace-driven simulations, we are able to verify our model, finding good agreement both in the observed distribution of memory among competing programs and in CPU utilization. We find that for high ratios of secondary to primary memory access time and under conditions of high memory contention, small programs with compact working sets are able to run with far less than expected interference from larger, more diffuse programs. In the case of multiprogramming the same program several times, we find that observed partition distributions are not necessarily even and that higher than expected levels of CPU use are observed. Lower ratios of access time are found to yield different results; programs compete on a more even basis and partition memory relatively more evenly than with higher ratios.
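
The random-walk model can be sketched directly: two programs compete for F frames, and whichever faults steals a frame from the other. The miss-ratio curve below is a placeholder shape, not the paper's trace-driven data.

    # Sketch of the contention model: a random walk over memory partitions.
    import random

    F = 100
    def miss(frames, size):
        return 0.5 ** (frames / size)         # assumed miss-ratio curve

    def step(m1, size1, size2):
        p1, p2 = miss(m1, size1), miss(F - m1, size2)
        if random.random() < p1 / (p1 + p2):  # the program that faults gains a frame
            return min(F - 1, m1 + 1)
        return max(1, m1 - 1)

    m1 = F // 2
    for _ in range(10_000):
        m1 = step(m1, size1=30, size2=120)
    print(m1)   # settles near 20 frames: the compact program needs far less than half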

Proceedings Article
18 Aug 1980
TL;DR: This paper presents a scheme for automatically reorganizing event information in memory, implemented in a computer program called CYRUS; maintaining good memory organization is important in large memory systems.
Abstract: Maintaining good memory organization is important in large memory systems. This paper presents a scheme for automatically reorganizing event information in memory. The processes are implemented in a computer program called CYRUS.
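
A toy rendering of organizing events by shared features (CYRUS's actual reorganization processes are far richer; the names and features here are invented for illustration):

    # Toy event memory indexed by features; shared features group similar events.
    index = {}                                   # (feature, value) -> event ids

    def add_event(eid, features):
        for item in features.items():
            index.setdefault(item, []).append(eid)

    add_event(1, {"actor": "diplomat", "topic": "treaty"})
    add_event(2, {"actor": "diplomat", "topic": "visit"})
    print(index[("actor", "diplomat")])          # -> [1, 2]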

01 Jan 1980
TL;DR: This thesis examines the feasibility of using a tagged memory and a stack processor to implement a capability based computer and illustrates that advanced programming concepts can be implemented quite simply if the proper hardware support is provided.
Abstract: This thesis examines the feasibility of using a tagged memory and a stack processor to implement a capability-based computer. The hypothesis put forth here is that these two architectural features reduce the cost of the capability mechanism and result in a simpler implementation. We begin by motivating memory protection and protection systems as a basic need of modern computer systems. A brief historical survey is presented which begins with Dennis and Van Horn's paper on the semantics of multi-programmed computations and ends with a review of the major capability machines which resulted from this paper. We then introduce the memory and processor organization proposed for this design and compare this organization with that used in previous architectures. This discussion shows that a tagged-memory organization reduces the number of segments used by a process by allowing segments to contain both pointer and data information. This reduces memory management overhead and allows a simpler representation of objects. Moreover, the tagged memory simplifies the mechanisms used to change domains, pass parameters, and address information in primary memory. The stack processor is shown to further reduce the cost of the capability mechanism by providing an inexpensive way to allocate procedure activation records, handle domain changes, and address objects on the stack. Next we present the capability mechanism for the proposed design, discuss the process of mapping capability-based virtual addresses into absolute primary memory addresses, and show how the abstract type concept is directly supported by the capability mechanism. A discussion of the hardware facilities needed to support the design follows. This discussion presents a detailed view of the registers of the central processor, the organization of the process stack, the instruction set, and a possible firmware organization which could be used to implement the design. The proposed architecture is then compared with two conventional machines and shown to be more efficient in representing programs. Moreover, evidence is presented which suggests that the performance of the proposed machine would be competitive with current state-of-the-art machines. Finally, two examples are presented which use the abstract type, process synchronization, and process communication facilities provided by the design. The major contribution of this thesis is the thorough examination it provides of the tagged memory approach to capability addressing. Our research shows a tagged-memory capability machine is powerful enough to implement an operating system, and the resulting operating system has a complexity about equal to that of the older generation of simple systems with little memory protection. This thesis also illustrates that advanced programming concepts can be implemented quite simply if the proper hardware support is provided. Finally, we show that the process control mechanism introduced in this thesis removes an important source of memory contention from the mutual exclusion mechanism. This is seen as a partial solution to the memory contention problem found on many multi-processor systems.
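
The core of the tagged-memory idea is that every word carries a hardware tag telling a capability from plain data, so one segment can safely mix both. A minimal sketch, with a two-tag scheme and names invented for illustration:

    # Sketch of tagged memory: the tag gates what each word may be used for.
    DATA, CAP = 0, 1
    memory = {}                                  # address -> (tag, value)

    def store_data(addr, value):
        memory[addr] = (DATA, value)

    def store_capability(addr, segment, rights):
        memory[addr] = (CAP, (segment, rights))  # only trusted code may mint these

    def dereference(addr):
        tag, value = memory[addr]
        if tag != CAP:
            raise TypeError("hardware trap: word is not a capability")
        return value

    store_data(0, 99)
    store_capability(1, segment=7, rights="r")
    print(dereference(1))                        # -> (7, 'r')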

Patent
24 Jun 1980
TL;DR: In this article, the authors describe a system for simulation of logic operations comprised of an array of specially designed parallel processors, there being no theoretical limit to the number of processors which may be assembled into the array.
Abstract: A computer system for simulation of logic operations comprised of an array of specially designed parallel processors (1-31), there being no theoretical limit to the number of processors which may be assembled into the array. Each processor executes a logic simulation function wherein the logic subnetwork simulated by each processor is implicitly described by a program loaded into each processor instruction memory (202). Logic values simulated by one processor are communicated to other processors by a switching mechanism (33) controlled by a controller (32). If the array consists of i processor addresses, the switch is a full i-by-i way switch. Each processor is operated in parallel, and the major component of each processor is a first set of two memory banks (35,36) for storing the simulated logic values associated with the output of each logic block. A second set of two memory banks (38,39) are included in each processor for storing logic simulations from other processors to be combined with the logic simulation stored in the first set of memory banks.
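
The simulation loop the patent describes can be sketched compactly: each "processor" evaluates its gate subnetwork from the current value bank, then the switch distributes outputs for the next step. The double-buffering mirrors the patent's two memory banks; the gates and programs are illustrative.

    # Sketch of parallel logic simulation with exchanged values (illustrative).
    AND = lambda a, b: a & b
    OR  = lambda a, b: a | b

    # processor id -> list of (output_name, op, input_names)
    programs = {
        0: [("x", AND, ("a", "b"))],
        1: [("y", OR,  ("x", "c"))],
    }

    values = {"a": 1, "b": 1, "c": 0, "x": 0, "y": 0}   # bank of current logic values
    for step in range(2):
        new = {}                                        # second bank: next values
        for proc, prog in programs.items():             # all processors run in parallel
            for out, op, ins in prog:
                new[out] = op(*(values[i] for i in ins))
        values.update(new)                              # switch broadcasts the outputs
    print(values["y"])   # -> 1 after x propagates through the switch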

Patent
Jean Marie Bourrez1
05 Dec 1980
TL;DR: The device includes first means for assigning a system processor to processes that are candidates for execution, the first means being controlled by the processor in which an event requiring a change in the state of the system has occurred, and second means for saving the context of the running process so that the selected processor can take up the new process assigned by the first means.
Abstract: The device includes first means for assigning a system processor to processes that are candidates for execution, said first means being controlled by the processor in which an event requiring a change in the state of the system has occurred, and second means for emptying the context of the running process so that the selected processor can be given the new process assigned by the first means. Application: information processing systems.

01 Oct 1980
TL;DR: The execution of matrix multiplication in a multiprocessor system with virtual memory was evaluated by simulation, in which a memory interference model capable of dealing with priority was used to dynamically modify various job execution times according to the number of processors and I/O channels active in the system.
Abstract: Multiprocessing is an effective architectural approach to enhance the performance of computer systems. However, various problems involved in multiprocessing may severely degrade system performance. This research has mainly centered on the memory interference problem in tightly coupled multiprocessor computer systems. Depending on the nature of the memory-requesting mechanism, discussion is centered on two important cases of such systems. The memory interference in multiprocessor systems with time-division-multiplexed (TDM) busses is first discussed. A general model for the memory interference in synchronous multiprocessor systems which allows arbitrary memory request rates, non-uniform memory references, and unequal processor priorities is presented next. Several application examples which make use of the memory interference models derived are presented. First, an algorithm is proposed for the estimation of the execution time of a program running in a multiprocessor system. Such an algorithm can be used to pick a computation decomposition which best utilizes the available computing power. A case study of the effect of computation decomposition on the performance of Gaussian Elimination is presented. The execution of matrix multiplication in a multiprocessor system with virtual memory was evaluated by simulation, in which a memory interference model capable of dealing with priority was used to dynamically modify various job execution times according to the number of processors and I/O channels active in the system.
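
The basic phenomenon is easy to simulate: processors issue requests to shared memory modules each cycle, and clashes stall all but one requester. The uniform references and fixed request rate below are assumptions, not the thesis's model.

    # Toy simulation of memory interference (illustrative assumptions).
    import random

    def interference(P, M, request_rate, cycles=100_000):
        waits = 0
        for _ in range(cycles):
            requests = [random.randrange(M)
                        for p in range(P) if random.random() < request_rate]
            for module in set(requests):
                waits += requests.count(module) - 1    # all but one requester stall
        return waits / (cycles * P)                    # fraction of cycles lost

    print(round(interference(P=8, M=8, request_rate=0.5), 3))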

Proceedings ArticleDOI
06 May 1980
TL;DR: KMP/II is a multiprocessor system designed to work as a multi-lingual high level language computer in a distributed processing environment and simulations are done to determine the process scheduling and processor assignment algorithm and to prove the effectiveness of the design concept.
Abstract: KMP/II is a multiprocessor system designed to work as a multi-lingual high-level-language computer in a distributed processing environment. The multiprocessor system is composed of up to 15 dynamically microprogrammable LSI processors. One processor executes OS functions. Another processor executes I/O functions. All the rest execute user programs as high-level-language job processors, each emulating an individual high-level language. The allocation of high-level-language emulators to processors changes dynamically depending upon the load of user jobs in terms of the languages. The behaviour of user-job processes written in high-level languages and of system jobs is measured by monitoring typical minicomputers. Hardware characteristics of the system are also collected by measuring the prototype system. Simulations using the data obtained through the above measurements are done to determine the process scheduling and processor assignment algorithm and to prove the effectiveness of the design concept.
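
The dynamic emulator allocation can be pictured as load-proportional assignment; this is a guess at the flavour of the policy, not KMP/II's actual algorithm.

    # Toy load-proportional allocation of language emulators to job processors.
    from collections import Counter

    def allocate(queued_jobs, n_processors):
        load = Counter(queued_jobs)                 # language -> queued job count
        total = sum(load.values())
        # Rounding may need adjustment in practice to hit n_processors exactly.
        return {lang: max(1, round(n_processors * c / total))
                for lang, c in load.items()}

    print(allocate(["FORTRAN"] * 9 + ["COBOL"] * 3, n_processors=12))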


Proceedings ArticleDOI
06 May 1980
TL;DR: This work studies the comparison between an m-processor (multiprocessor) system and a single-processor system whose processor is m times as fast as any in the multiprocesser system, using both combinatorial and probabilistic models.
Abstract: We study the comparison between an m-processor (multiprocessor) system and a single-processor system whose processor is m times as fast as any in the multiprocessor system. The expected superiority of the single-processor system is measured in terms of mean and maximum flow times, using both combinatorial and probabilistic models.
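
A quick numerical experiment shows the effect: for mean flow time on a single machine, shortest-processing-time order is optimal, and the m-times-faster single processor beats m unit-speed processors. This deterministic list-scheduling sketch is far simpler than the paper's combinatorial and probabilistic models.

    # Sketch: m unit-speed processors vs. one m-times-faster processor.
    import heapq

    def mean_flow_multi(jobs, m):
        free = [0.0] * m                       # next-free times of the m processors
        heapq.heapify(free)
        total = 0.0
        for p in jobs:                         # jobs arrive at time 0, taken in order
            t = heapq.heappop(free)
            heapq.heappush(free, t + p)        # unit speed: job p takes p time
            total += t + p
        return total / len(jobs)

    def mean_flow_single_fast(jobs, m):
        t, total = 0.0, 0.0
        for p in sorted(jobs):                 # SPT order minimizes mean flow time
            t += p / m                         # one processor, m times as fast
            total += t
        return total / len(jobs)

    jobs = [3, 1, 4, 1, 5, 9, 2, 6]
    print(mean_flow_multi(sorted(jobs), 4),    # 4.75
          mean_flow_single_fast(jobs, 4))      # ~2.94: the fast single CPU wins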

Journal ArticleDOI
TL;DR: An alternative multiple processor organization to the distributed pipeline (DP) is presented and shown to be superior to the DP in every respect.
Abstract: An alternative multiple processor organization to the distributed pipeline (DP) [1] is presented. It is shown that the proposed organization is superior to the DP in every respect.

01 Oct 1980
TL;DR: An analytic model has been developed to describe the performance of a wide range of multiprocessor system configurations and workloads and provides an estimate of system throughput for various numbers of processors, jobs, and drum sectors, and for various workloads.
Abstract: The rapid advancement in semiconductor technology continues to change the environment in which computers are designed. As hardware costs decline, systems with multiple processors become an interesting alternative to conventional single-processor systems. An analytic model has been developed to describe the performance of a wide range of multiprocessor system configurations and workloads. This model deals specifically with P tightly-coupled, identical processors with shared primary and secondary memory. Secondary memory consists of a paging drum with S sectors. The workload consists of J independent, identically-distributed jobs whose faulting or I/O behavior is described by both a mean (lambda faults/sector-time) and a squared coefficient of variation (K). In addition, the processing overhead for each I/O request is added to a job's execution time at a processor (C sector-times/fault). The model developed provides an estimate of system throughput for various numbers of processors, jobs, and drum sectors, and for various workloads. The model can be used to examine the behavior of multiprocessor systems, including the sensitivity of system throughput to each of the system parameters and parameter trade-offs related to system performance. In particular, in a system with a fixed amount of memory, the addition of jobs to the system causes a change in the memory allocation for each job and thus modifies each job's faulting behavior. The resulting throughput formula is useful for examining the desirability of adding or subtracting jobs in such a system.
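
The parameters invite a toy simulation; the following is a hedged sketch in the spirit of the abstract, not the thesis's analytic formula (the drum is modelled crudely as a fixed half-revolution delay).

    # Toy throughput estimate for P processors, J jobs, S drum sectors,
    # fault rate lam per sector-time, and per-fault CPU overhead C.
    import random

    def simulate(P, J, S, lam, C, steps=50_000):
        busy = 0
        ready, waiting = J, []                # waiting: drum-completion times (FIFO)
        for t in range(steps):
            running = min(P, ready)
            busy += running
            for _ in range(running):          # each running job may fault this tick
                if random.random() < lam:
                    ready -= 1
                    waiting.append(t + S // 2 + C)   # half-revolution + overhead
            while waiting and waiting[0] <= t:
                waiting.pop(0)
                ready += 1
        return busy / steps                   # mean busy processors = throughput

    print(round(simulate(P=4, J=8, S=8, lam=0.02, C=1), 2))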