
Showing papers on "Distributed database published in 1983"


Journal ArticleDOI
TL;DR: It is shown that all true deadlocks are detected and that no false deadlocks are reported, and that the algorithms can be applied in distributed database and other message communication systems.
Abstract: Distributed deadlock models are presented for resource and communication deadlocks. Simple distributed algorithms for detection of these deadlocks are given. We show that all true deadlocks are detected and that no false deadlocks are reported. In our algorithms, no process maintains global information; all messages have an identical short length. The algorithms can be applied in distributed database and other message communication systems.
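The resource-deadlock algorithm here is probe-based: a blocked process sends a small, fixed-size probe along wait-for edges, and a deadlock is declared when a probe returns to its initiator. Below is a minimal, self-contained sketch of that idea over a static wait-for graph; the `wait_for` map and process names are illustrative, not taken from the paper.

```python
# Minimal sketch of probe-based deadlock detection over a static wait-for graph.
# A blocked process sends a probe (initiator, sender, receiver) along wait-for
# edges; if a probe ever reaches the initiator again, a true deadlock exists.

def detect_deadlock(initiator, wait_for):
    """Return True if `initiator` lies on a wait-for cycle."""
    probes = [(initiator, initiator, blocked_on)
              for blocked_on in wait_for.get(initiator, [])]
    seen = set()                      # (sender, receiver) pairs already probed
    while probes:
        origin, sender, receiver = probes.pop()
        if receiver == origin:        # probe returned to initiator: deadlock
            return True
        if (sender, receiver) in seen:
            continue
        seen.add((sender, receiver))
        # Forward the probe along every outgoing wait-for edge of `receiver`.
        probes.extend((origin, receiver, nxt)
                      for nxt in wait_for.get(receiver, []))
    return False

# Example wait-for graph: P1 -> P2 -> P3 -> P1 is a cycle; P4 waits on P1.
wait_for = {"P1": ["P2"], "P2": ["P3"], "P3": ["P1"], "P4": ["P1"]}
print(detect_deadlock("P1", wait_for))  # True: P1 is on the cycle
print(detect_deadlock("P4", wait_for))  # False: P4 waits on the cycle but is not on it
```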

449 citations


Journal ArticleDOI
TL;DR: A formal model for atomic commit protocols for a distributed database system is introduced and is used to prove existence results about resilient protocols for site failures that do not partition the network and then for partitioned networks.
Abstract: A formal model for atomic commit protocols for a distributed database system is introduced. The model is used to prove existence results about resilient protocols for site failures that do not partition the network and then for partitioned networks. For site failures, a pessimistic recovery technique, called independent recovery, is introduced and the class of failures for which resilient protocols exist is identified. For partitioned networks, two cases are studied: the pessimistic case in which messages are lost, and the optimistic case in which no messages are lost. In all cases, fundamental limitations on the resiliency of protocols are derived.
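For concreteness, the sketch below shows the decision rule of the standard two-phase commit protocol, the canonical atomic commit protocol the paper's formal model is built around; it is the textbook rule only, not the paper's model or its independent-recovery construction, and names such as `Participant` are illustrative.

```python
# Illustrative two-phase commit decision rule (failure handling omitted):
# the coordinator commits only if every participant votes YES; otherwise aborts.

class Participant:
    def __init__(self, name, can_commit):
        self.name = name
        self.can_commit = can_commit
        self.outcome = None

    def vote(self):
        return "YES" if self.can_commit else "NO"

    def decide(self, decision):
        self.outcome = decision

def two_phase_commit(participants):
    # Phase 1: collect votes from all participants.
    votes = [p.vote() for p in participants]
    decision = "COMMIT" if all(v == "YES" for v in votes) else "ABORT"
    # Phase 2: broadcast the decision.
    for p in participants:
        p.decide(decision)
    return decision

sites = [Participant("A", True), Participant("B", True), Participant("C", False)]
print(two_phase_commit(sites))  # ABORT, because site C voted NO
```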

301 citations


Journal ArticleDOI
TL;DR: A new algorithm (Algorithm GENERAL) is presented to derive processing strategies for arbitrarily complex queries to minimize the response time and the total time for distributed queries.
Abstract: The efficiency of processing strategies for queries in a distributed database is critical for system performance. Methods are studied to minimize the response time and the total time for distributed queries. A new algorithm (Algorithm GENERAL) is presented to derive processing strategies for arbitrarily complex queries. Three versions of the algorithm are given: one for minimizing response time and two for minimizing total time. The algorithm is shown to provide optimal solutions under certain conditions.
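The distinction the paper draws between minimizing response time and minimizing total time can be illustrated with a toy cost model: for a set of data transmissions that can proceed in parallel, total time sums the transmission costs, while response time is governed by the most expensive single transmission. The linear cost function below is an assumption for illustration, not necessarily the paper's exact model.

```python
# Toy cost model: each transmission of `size` bytes costs C0 + C1 * size.
# Total time sums the costs of all transmissions; response time, when the
# transmissions proceed in parallel, is set by the most expensive one.

C0, C1 = 20.0, 0.1   # illustrative startup cost and per-byte cost

def transmission_cost(size_bytes):
    return C0 + C1 * size_bytes

def total_time(transfer_sizes):
    return sum(transmission_cost(s) for s in transfer_sizes)

def response_time(transfer_sizes):
    return max(transmission_cost(s) for s in transfer_sizes)

transfers = [1000, 400, 250]         # bytes shipped from three sites
print(total_time(transfers))         # 225.0 (120 + 60 + 45)
print(response_time(transfers))      # 120.0 (the largest single transfer)
```

A strategy that is optimal for one objective need not be optimal for the other, which is why the paper gives separate versions of Algorithm GENERAL for each.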

266 citations


Journal ArticleDOI
Leslie Lamport1
TL;DR: It is shown that, like the original Byzantine Generals Problem, the weak version can be solved only if fewer than one-third of the processes may fail, and that an approximate solution exists that can tolerate arbitrarily many failures.
Abstract: The Byzantine Generals Problem requires processes to reach agreement upon a value even though some of them may fail. It is weakened by allowing them to agree upon an "incorrect" value if a failure occurs. The transaction commit problem for a distributed database is a special case of the weaker problem. It is shown that, like the original Byzantine Generals Problem, the weak version can be solved only if fewer than one-third of the processes may fail. Unlike the original problem, an approximate solution exists that can tolerate arbitrarily many failures.

187 citations


Proceedings ArticleDOI
17 Aug 1983
TL;DR: A theory for proving the correctness of algorithms that manage replicated data is presented, an extension of serializability theory, and is applied to three replicated data algorithms: Gifford's “quorum consensus” algorithm, Eager and Sevcik's “missing writes” algorithm, and Computer Corporation of America's “available copies” algorithm.
Abstract: A replicated database is a distributed database in which some data items are stored redundantly at multiple sites. The main goal is to improve system reliability. By storing critical data at multiple sites, the system can operate even though some sites have failed. However, few distributed database systems support replicated data, because it is difficult to manage as sites fail and recover. A replicated data algorithm has two parts. One is a discipline for reading and writing data item copies. The other is a concurrency control algorithm for synchronizing those operations. The read-write discipline ensures that if one transaction writes logical data item x, and another transaction reads or writes x, there is some physical manifestation of that logical conflict. The concurrency control algorithm synchronizes physical conflicts; it knows nothing about logical conflicts. In a correct replicated data algorithm, the physical manifestation of conflicts must be strong enough so that synchronizing physical conflicts is sufficient for correctness. This paper presents a theory for proving the correctness of algorithms that manage replicated data. The theory is an extension of serializability theory. We apply it to three replicated data algorithms: Gifford's “quorum consensus” algorithm, Eager and Sevcik's “missing writes” algorithm, and Computer Corporation of America's “available copies” algorithm.
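Gifford's quorum consensus, one of the three algorithms analyzed, can be pictured as follows: each copy carries a version number, a read collects a read quorum and takes the highest-versioned value, and a write installs a new version at a write quorum, with r + w > n guaranteeing that every read quorum intersects every write quorum. The class below is an illustrative sketch under those assumptions, not the paper's formalism or Gifford's original code.

```python
# Sketch of quorum-consensus replication: n copies, read quorum r, write
# quorum w, with r + w > n so any read quorum intersects any write quorum.

class ReplicatedItem:
    def __init__(self, n, r, w, initial=None):
        assert r + w > n, "quorums must intersect"
        self.copies = [{"version": 0, "value": initial} for _ in range(n)]
        self.r, self.w = r, w

    def read(self, available_sites):
        quorum = available_sites[: self.r]
        assert len(quorum) == self.r, "not enough sites for a read quorum"
        # The copy with the highest version number is current.
        newest = max((self.copies[i] for i in quorum), key=lambda c: c["version"])
        return newest["value"]

    def write(self, available_sites, value):
        quorum = available_sites[: self.w]
        assert len(quorum) == self.w, "not enough sites for a write quorum"
        new_version = max(c["version"] for c in self.copies) + 1
        for i in quorum:
            self.copies[i] = {"version": new_version, "value": value}

item = ReplicatedItem(n=5, r=3, w=3, initial="v0")
item.write(available_sites=[0, 1, 2], value="v1")    # sites 3 and 4 are down
print(item.read(available_sites=[2, 3, 4]))          # "v1": quorums intersect at copy 2
```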

175 citations


Journal ArticleDOI
TL;DR: A level of robustness termed maximal partial operability is identified; under the models of concurrency control and robustness introduced here, it is the highest level attainable without significantly degrading performance.
Abstract: The problem of concurrency control in distributed database systems in which site and communication link failures may occur is considered. The possible range of failures is not restricted; in particular, failures may induce an arbitrary network partitioning. It is desirable to attain a high “level of robustness” in such a system; that is, these failures should have only a small impact on system operation. A level of robustness termed maximal partial operability is identified. Under our models of concurrency control and robustness, this robustness level is the highest level attainable without significantly degrading performance. A basis for the implementation of maximal partial operability is presented. To illustrate its use, it is applied to a distributed locking concurrency control method and to a method that utilizes timestamps. When no failures are present, the robustness modifications for these methods induce no significant additional overhead.
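One simple way to keep part of a partitioned system operating, in the spirit of the robustness levels discussed here, is to allow updates to a data item only in the partition holding a majority of that item's copies. The sketch below illustrates that rule of thumb; it is an assumption for illustration, not the paper's specific "maximal partial operability" construction.

```python
# Illustrative partition-operability rule: a group of mutually reachable sites
# may update a data item only if it contains a strict majority of the sites
# holding copies of that item.

def can_update(item_copy_sites, reachable_sites):
    reachable_copies = set(item_copy_sites) & set(reachable_sites)
    return len(reachable_copies) * 2 > len(item_copy_sites)

copies_of_x = {"s1", "s2", "s3", "s4", "s5"}
partition_a = {"s1", "s2", "s3"}          # network partition: two groups
partition_b = {"s4", "s5"}

print(can_update(copies_of_x, partition_a))  # True: holds 3 of 5 copies
print(can_update(copies_of_x, partition_b))  # False: holds only 2 of 5 copies
```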

143 citations


Journal ArticleDOI
TL;DR: The optimal distribution of a database schema over a number of sites in a distributed network is considered and the design is driven by user-supplied information about data distribution.
Abstract: The optimal distribution of a database schema over a number of sites in a distributed network is considered. The database is modeled in terms of objects (relations or record sets) and links (predefined joins or CODASYL sets). The design is driven by user-supplied information about data distribution. The inputs required by the optimization model are: 1) cardinality and size information about objects and links, 2) a set of candidate horizontal partitions of relations into fragments and the allocations of the fragments, and 3) the specification of all important transactions, their frequencies, and their sites of origin.
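The three kinds of inputs the optimization model requires (statistics about objects and links, candidate horizontal fragments with their allocations, and the transaction workload) can be captured in a small data structure. The field names below are illustrative assumptions, not the paper's notation.

```python
# Illustrative encoding of the design inputs: object statistics, candidate
# horizontal fragments with their candidate sites, and the transaction workload.

from dataclasses import dataclass, field

@dataclass
class ObjectStats:
    name: str
    cardinality: int          # number of records / tuples
    record_size: int          # bytes per record

@dataclass
class Fragment:
    object_name: str
    predicate: str            # horizontal partitioning predicate
    candidate_sites: list     # sites where this fragment may be allocated

@dataclass
class Transaction:
    name: str
    frequency: float          # executions per unit time
    origin_site: str
    objects_read: list = field(default_factory=list)
    objects_written: list = field(default_factory=list)

design_inputs = {
    "objects": [ObjectStats("EMPLOYEE", 50_000, 120)],
    "fragments": [Fragment("EMPLOYEE", "dept = 'EU'", ["paris", "milan"]),
                  Fragment("EMPLOYEE", "dept = 'US'", ["boston"])],
    "transactions": [Transaction("payroll", 200.0, "boston",
                                 objects_read=["EMPLOYEE"])],
}
print(len(design_inputs["fragments"]))   # 2 candidate fragments
```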

139 citations


Proceedings ArticleDOI
21 Mar 1983
TL;DR: A reliability algorithm being considered for DDM, a distributed database system under development at Computer Corporation of America, is designed to tolerate clean site failures in which sites simply stop running.
Abstract: We describe a reliability algorithm being considered for DDM, a distributed database system under development at Computer Corporation of America. The algorithm is designed to tolerate clean site failures in which sites simply stop running. The algorithm allows the system to reconfigure itself to run correctly as sites fail and recover. The algorithm solves the subproblems of atomic commit and replicated data handling in an integrated manner.

73 citations


Proceedings ArticleDOI
01 May 1983
TL;DR: An algorithm is given to process a query in a fragmented distributed database environment; it makes use of redundant relations to reduce communication cost, and a process to estimate the cost and benefit of a semi-join, based on dynamic execution of semi-joins, is introduced.
Abstract: An algorithm is given to process a given query in a fragmented distributed data base environment. Unlike previous algorithms, it has the following desired features: (1) it makes use of redundant relations to reduce communication cost; (2) a copy of each relation referenced by the query is selected so that the set of relations is contained in the minimum number of sites; (3) an efficient algorithm to process fragments is provided; (4) all relations that need not be sent to the assembly site to produce the answer are identified, so the unnecessary sending of these relations across sites and processing on these relations, which are common in earlier algorithms, are avoided; (5) useless semi-joins are discarded and "worse" semi-joins are replaced by better ones; (6) a process to estimate the cost and the benefit of a semi-join, based on dynamic execution of semi-joins, is introduced, and this new process is expected to be more accurate than earlier estimation processes. The algorithm is easy to implement and is operational.
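Feature (6) rates each semi-join by weighing its cost against its benefit. A common way to make that concrete, shown below under assumed statistics, is: the cost is shipping the projected join column, the benefit is the reduction in the size of the relation being reduced, and only semi-joins whose benefit exceeds their cost are kept. The selectivity-based formulas are standard static estimates, not the paper's dynamic ones.

```python
# Illustrative semi-join cost/benefit estimate (static, textbook-style; the
# paper refines this using dynamic execution of semi-joins).
# Semi-join R ⋉ S on attribute A: ship the projected column of S to R's site
# and keep only the R tuples that join. Benefit = bytes removed from R;
# cost = bytes shipped.

def semijoin_profit(r_size_bytes, s_proj_size_bytes, selectivity):
    """selectivity = fraction of R tuples that survive the semi-join."""
    benefit = (1.0 - selectivity) * r_size_bytes   # R shrinks by this much
    cost = s_proj_size_bytes                       # projected column shipped
    return benefit - cost

# Keep only the profitable semi-joins.
candidates = {
    "R1 semi-join S on A": semijoin_profit(r_size_bytes=80_000,
                                           s_proj_size_bytes=5_000,
                                           selectivity=0.3),
    "R2 semi-join S on B": semijoin_profit(r_size_bytes=10_000,
                                           s_proj_size_bytes=9_000,
                                           selectivity=0.9),
}
profitable = {name: p for name, p in candidates.items() if p > 0}
print(profitable)   # only the first semi-join pays off
```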

71 citations


Proceedings ArticleDOI
21 Mar 1983
TL;DR: This paper presents solutions that allow concurrent access to extendible hash files, based on locking protocols and minor modifications to the data structure, and describes first attempts at adapting extendible hash files for distributed data.
Abstract: The extendible hash file is a dynamic data structure that is an alternative to B-trees for use as a database index. While there have been many algorithms proposed to allow concurrent access to B-trees, similar solutions for extendible hash files have not appeared. In this paper, we present solutions to allow for concurrency that are based on locking protocols and minor modifications in the data structure. Another question that deserves consideration is whether these indexing structures can be adapted for use in a distributed database. Among the motivations for distributing data are increased availability and ease of growth; however, unless data structures in the access path are designed to support those goals, they may not be realized. We describe some first attempts at adapting extendible hash files for distributed data.
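A minimal way to picture "locking plus minor structural modifications" for extendible hashing: a reader takes the directory entry without a lock, locks the target bucket, and then re-verifies that the bucket still covers the key's hash prefix (a concurrent split may have moved the key), retrying through the directory if not. The sketch below shows only that lookup protocol with per-bucket locks and omits splitting; it is an assumed illustration, not the paper's exact protocol.

```python
# Sketch of a concurrent lookup protocol for an extendible hash file:
# read the directory without locking, lock the bucket, then verify that the
# bucket still covers the key's hash prefix and retry if it does not.
# Bucket splitting itself is omitted.

import threading

class Bucket:
    def __init__(self, local_depth):
        self.local_depth = local_depth   # number of hash bits this bucket covers
        self.prefix = 0                  # value of those bits for this bucket
        self.records = {}
        self.lock = threading.Lock()

class ExtendibleHashFile:
    def __init__(self, global_depth, buckets):
        self.global_depth = global_depth
        self.directory = buckets         # 2**global_depth entries

    def _dir_index(self, key):
        return hash(key) & ((1 << self.global_depth) - 1)

    def lookup(self, key):
        while True:
            bucket = self.directory[self._dir_index(key)]   # unlocked read
            with bucket.lock:
                # Verify: does this bucket still cover the key's hash prefix?
                mask = (1 << bucket.local_depth) - 1
                if (hash(key) & mask) == bucket.prefix:
                    return bucket.records.get(key)
            # Bucket was split concurrently: retry via the directory.

# Tiny illustration with global depth 1 and two buckets (integer keys so the
# hash values are deterministic).
b0, b1 = Bucket(local_depth=1), Bucket(local_depth=1)
b0.prefix, b1.prefix = 0, 1
b0.records[4] = "record for key 4"   # hash(4) & 1 == 0
b1.records[7] = "record for key 7"   # hash(7) & 1 == 1
ehf = ExtendibleHashFile(global_depth=1, buckets=[b0, b1])
print(ehf.lookup(4), ehf.lookup(7), ehf.lookup(10))   # record..., record..., None
```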

59 citations


Proceedings ArticleDOI
01 May 1983
TL;DR: The concept of "local sufficiency" is introduced as a measure of parallelism, and it is shown how certain classes of queries lead naturally to irredundant partitions of a database that are locally sufficient.
Abstract: In this paper we treat the problem of subdividing a database and allocating the fragments to the sites in a distributed database system in order to maximize non-duplicative parallelism. Our goal is to establish a conceptual framework for distributing data without being committed to specific cost models. We introduce the concept of "local sufficiency" as a measure of parallelism, and show how certain classes of queries lead naturally to irredundant partitions of a database that are locally sufficient. For classes of queries for which no irredundant distribution is locally sufficient, we offer ways to introduce redundancy in achieving local sufficiency.

Proceedings ArticleDOI
01 May 1983
TL;DR: Adaplex is an integrated language for programming database applications; this paper provides an overview of DDM, a distributed database manager that supports the use of Adaplex as its interface language.
Abstract: Adaplex is an integrated language for programming database applications. It results from the embedding of the database sublanguage DAPLEX in the general purpose programming language Ada. This paper provides an overview of DDM, a distributed database manager that supports the use of Adaplex as an interface language. The important technical innovations we have incorporated in the design of this system include: 1. An advanced data model that captures more application semantics than conventional data models. 2. Support for flexible data distribution options that improve locality of reference and efficiency of query processing. 3. Extensive query optimization that combines compile time access path optimization with run time site selection. 4. Efficient transaction management that reduces transaction conflicts and improves the resiliency of replicated data. 5. Robust, incremental recovery management that provides for automatic recovery from certain "catastrophic" failure conditions.

Journal ArticleDOI
TL;DR: The problem of proving the correctness of execution strategies is reduced to the problem of proving the equivalence of two expressions of Multirelational Algebra, which constitutes a theoretical foundation for the design of query optimizers for distributed databases.
Abstract: A major requirement of a Distributed DataBase Management System (DDBMS) is to enable users to write queries as though the database were not distributed (distribution transparency). The DDBMS transforms the user's queries into execution strategies, that is, sequences of operations on the various nodes of the network and of transmissions between them. An execution strategy on a distributed database is correct if it returns the same result as if the query were applied to a nondistributed database. This paper analyzes the correctness problem for query execution strategies. A formal model, called Multirelational Algebra, is used as a unifying framework for this purpose. The problem of proving the correctness of execution strategies is reduced to the problem of proving the equivalence of two expressions of Multirelational Algebra. A set of theorems on equivalence is given in order to facilitate this task. The proposed approach can also be used for the generation of correct execution strategies, because it defines the rules which allow the transformation of a correct strategy into an equivalent one. This paper does not deal with the problem of evaluating equivalent strategies, and therefore is not in itself a proposal for a query optimizer for distributed databases. However, it constitutes a theoretical foundation for the design of such optimizers.
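A typical equivalence such an algebra supports is that a selection over a horizontally fragmented relation can be pushed down to the fragments and the results unioned, letting each node filter its own fragment. The sketch below checks that rule numerically on toy data, using plain Python lists in place of the paper's Multirelational Algebra expressions.

```python
# Toy check of a standard distribution-transparency equivalence:
#   select_p(R1 union R2) == select_p(R1) union select_p(R2)
# i.e. a selection over a horizontally fragmented relation equals the union of
# selections applied locally at each fragment's node.

def select(predicate, tuples):
    return [t for t in tuples if predicate(t)]

# Horizontal fragments of relation EMP stored at two nodes (toy data).
emp_node1 = [("ann", "paris", 40), ("bob", "paris", 55)]
emp_node2 = [("carl", "milan", 38), ("dora", "milan", 61)]
p = lambda t: t[2] > 50                      # employees older than 50

# Strategy A: ship everything to one node, then select (nondistributed view).
centralized = select(p, emp_node1 + emp_node2)

# Strategy B: select locally at each node, then union the small results.
distributed = select(p, emp_node1) + select(p, emp_node2)

assert sorted(centralized) == sorted(distributed)   # same answer: correct strategy
print(distributed)   # [('bob', 'paris', 55), ('dora', 'milan', 61)]
```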


Journal ArticleDOI
TL;DR: Experimental results are presented which demonstrate that in certain environments and under certain important applications, locality of reference is an undeniable characteristic of the information accessing behavior of a hierarchical database management system.
Abstract: Localized information referencing is a long-known and much-exploited facet of program behavior. The existence of such behavior in the data accessing patterns produced by database management systems is not currently supported by empirical results. We present experimental results which demonstrate that in certain environments and under certain important applications, locality of reference is an undeniable characteristic of the information accessing behavior of a hierarchical database management system. Furthermore, database locality of reference is in a sense more regular, predictable, and hence, more exploitable than the localized reference activity found in programs in general. The implications of these results for the performance enhancement and workload characterization of database management systems are discussed.
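One standard way to quantify the locality such experiments measure is the LRU stack distance of each reference in an access trace: strong locality shows up as many small distances. The sketch below computes stack distances for a synthetic trace; the trace and the threshold are illustrative, not the paper's workload.

```python
# LRU stack-distance profile of a reference trace: for each access, the
# distance is how many distinct pages were touched since the last access to
# the same page (inf on first touch). Clustered small distances = high locality.

def lru_stack_distances(trace):
    stack = []                       # most recently used page at the end
    distances = []
    for page in trace:
        if page in stack:
            pos = len(stack) - 1 - stack.index(page)   # 0 = re-reference of MRU page
            distances.append(pos)
            stack.remove(page)
        else:
            distances.append(float("inf"))             # cold miss
        stack.append(page)
    return distances

trace = ["A", "B", "A", "A", "C", "B", "A", "A", "B"]  # synthetic, highly local
d = lru_stack_distances(trace)
print(d)                                   # [inf, inf, 1, 0, inf, 2, 2, 0, 1]
print(sum(x <= 2 for x in d) / len(d))     # fraction of near re-references
```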


Journal ArticleDOI
TL;DR: An approach to processing distributed queries that makes explicit use of redundant data is proposed, and the role of data redundancy in maximizing parallelism and minimizing data movement is clarified.
Abstract: In this paper an approach to processing distributed queries that makes explicit use of redundant data is proposed. The basic idea is to focus on the dynamics of materialization, defined as the collection of data and partial results available for processing at any given time, as query processing proceeds. In this framework the role of data redundancy in maximizing parallelism and minimizing data movement is clarified. What results is not only the discovery of new algorithms but an improved framework for their evaluation.

Journal ArticleDOI
TL;DR: This work describes a proof schema for analyzing concurrency control correctness and illustrates the proof schema by presenting two new concurrency algorithms for distributed database systems.
Abstract: Concurrency control algorithms for database systems are usually regarded as methods for synchronizing Read and Write operations. Such methods are judged to be correct if they only produce serializable executions. However, Reads and Writes are sometimes inaccurate models of the operations executed by a database system. In such cases, serializability does not capture all aspects of concurrency control executions. To capture these aspects, we describe a proof schema for analyzing concurrency control correctness. We illustrate the proof schema by presenting two new concurrency algorithms for distributed database systems.

Proceedings Article
31 Oct 1983
TL;DR: DODM is described, a simple model for object sharing in distributed database systems that provides a small set of operations for object definition, manipulation, and retrieval in a distributed environment.
Abstract: This paper describes DODM, a simple model for object sharing in distributed database systems. The model provides a small set of operations for object definition, manipulation, and retrieval in a distributed environment. Relationships among objects can be established across database boundaries, objects are relocatable within the distributed environment, and mechanisms are provided for object sharing among individual databases. An object naming convention supports location transparent object references; that is, objects can be referenced by user-defined names rather than by address. The primitive operations introduced can be used as the basis for the specification and stepwise development of database models and database systems of increasing complexity. An example is provided to illustrate the use of DODM in the design of a distributed database system supporting a semantically expressive database model. 1. Introduction. Distributed computing systems are becoming increasingly common. This trend is largely caused by the decreasing cost of hardware: not only are powerful personal computers becoming so inexpensive today that individuals can afford them for personal use, but the cost of computer networks that enable computer systems to exchange information at a very high rate is decreasing drastically. Decentralization overcomes many of the limitations and deficiencies of centralized systems. A network of computers simply provides a higher level of performance, availability, reliability, fault tolerance, and security than a centralized computer system. (This research was supported, in part, by the Joint Services Electronics Program through the Air Force Office of Scientific Research under contract F49820-El-C.0070.) In addition to the technical advantages that make decentralized systems feasible, social attitudes tend to indicate that a collection of smaller, autonomous computer systems is preferable to large central systems. The growing popularity of distributed computing establishes a need for mechanisms that allow individual users to communicate with each other and share both hardware and software resources. Individual users also need access to the growing number of "public" databases, which contain a variety of information such as grocery prices, the values of stocks, and the histories of bank accounts. Of course, sharing mechanisms decrease the autonomy of the components of the distributed environment, and affect the performance, availability, reliability, fault tolerance, and security of the total system. Sharing and communication mechanisms also introduce data transmission and naming problems. Most current approaches to distributed database management system design fail to adequately address issues concerning location transparency (the ability to reference data by name rather than by address), logical decentralization, catalog management, and the uniform handling of meta-data and user-data. Logically centralized database systems [Rothnie 80, Stonebraker 77, Andler 82] provide the users with a single integrated database schema describing all the data in the physically centralized or distributed environment. Recent research has also resulted in approaches to support the integration of heterogeneous as well as homogeneous (preexisting) databases [Motro 81, Smith 81, Litwin 81, Kimbleton 79]. However, a critical remaining problem is accommodating information sharing among individual, autonomous databases.
Finally, existing distributed database system architectures that emphasize the autonomy of the individual databases [Heimbigner 82, Williams 81, Tsichritzis 82] require centralized or complex catalog management. The aim of the research described in this paper is to define a simple model for object sharing in distributed database systems. This is done by stepwise development of a series of object-oriented models. First, a simple model called ODM (for object-oriented database model) is defined. ODM provides a
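The location-transparent naming the model calls for can be pictured as a catalog that maps user-defined names to (site, local identifier) pairs, so a reference by name is bound to a location only at access time, and relocating an object just updates its catalog entry. The sketch below uses illustrative names and is not the DODM operation set.

```python
# Sketch of location-transparent object references: user code holds only names;
# a catalog resolves a name to (site, local_id), so relocating an object means
# updating the catalog, not every reference.

class Catalog:
    def __init__(self):
        self._entries = {}                 # name -> (site, local_id)

    def register(self, name, site, local_id):
        self._entries[name] = (site, local_id)

    def resolve(self, name):
        return self._entries[name]

    def relocate(self, name, new_site, new_local_id):
        self._entries[name] = (new_site, new_local_id)

catalog = Catalog()
catalog.register("stock_prices", site="nyc", local_id=17)

def fetch(name):
    site, local_id = catalog.resolve(name)            # late binding to a location
    return f"read object {local_id} at site {site}"

print(fetch("stock_prices"))                          # reads from nyc
catalog.relocate("stock_prices", new_site="chicago", new_local_id=3)
print(fetch("stock_prices"))                          # same name, new location
```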

Journal ArticleDOI
TL;DR: In this paper, the authors discuss some of the issues raised in the implementation of a distributed database management system by the requirements of site autonomy in the context of the R* research project at IBM's San Jose Research Lab.

Proceedings Article
31 Oct 1983
TL;DR: This paper re-examines the file allocation problem and presents an approach to three versions of the problem, thus demonstrating the flexibility of the approach and arguing that the method provides a practical solution to the problem.
Abstract: In this paper, we re-examine the file allocation problem. Because of changing technology, the assumptions we use here are different from those of previous researchers. Specifically, the interaction of files during processing of queries is explicitly incorporated into our model, and the cost of communication between two sites is dominated by the amount of data transfer and is independent of the receiving and the sending sites. We study the complexity of the file allocation problem using the new model. Unfortunately, the problem is NP-hard. We present an approach to three versions of the problem, thus demonstrating the flexibility of our approach. We further argue that our method provides a practical solution to the problem, because accurate solutions are obtained, the time complexity of our algorithm is much smaller than that of existing algorithms, the algorithm is conceptually simple and easy to implement, and it is adaptive to users' changing access patterns.
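The assumed cost model (communication cost dominated by the amount of data transferred, independent of which two sites are involved) makes a candidate allocation easy to evaluate: sum, over queries, the bytes that must cross site boundaries, weighted by query frequency. The sketch below encodes that evaluation for a toy workload; the structures are illustrative, and it keeps only the basic remote-access term rather than the paper's full model of file interaction.

```python
# Toy evaluation of a file allocation under a distance-independent cost model:
# a query running at its origin site must ship over the network every file it
# reads that is not stored locally; cost = frequency * bytes shipped.

def allocation_cost(queries, allocation, file_sizes):
    total = 0.0
    for q in queries:
        remote_bytes = sum(file_sizes[f] for f in q["files"]
                           if allocation[f] != q["site"])
        total += q["freq"] * remote_bytes
    return total

file_sizes = {"F1": 4_000, "F2": 1_000, "F3": 6_000}
queries = [
    {"site": "s1", "freq": 10, "files": ["F1", "F2"]},
    {"site": "s2", "freq": 3,  "files": ["F2", "F3"]},
]

plan_a = {"F1": "s1", "F2": "s1", "F3": "s2"}
plan_b = {"F1": "s2", "F2": "s2", "F3": "s2"}
print(allocation_cost(queries, plan_a, file_sizes))   # 3000.0  (3 * 1000)
print(allocation_cost(queries, plan_b, file_sizes))   # 50000.0 (10 * 5000)
```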

Journal ArticleDOI
TL;DR: An event order based model for specifying and analyzing concurrency control algorithms for distributed database systems has been presented in this article, where an expanded notion of history that includes the database access events as well as synchronization events is used to study the correctness, degree of concurrency, and other aspects of the algorithms such as deadlocks and reliability.
Abstract: An event order based model for specifying and analyzing concurrency control algorithms for distributed database systems has been presented. An expanded notion of history that includes the database access events as well as synchronization events is used to study the correctness, degree of concurrency, and other aspects of the algorithms such as deadlocks and reliability. The algorithms are mapped into serializable classes that have been defined based on the order of synchronization events such as lock points, commit point, arrival of a transaction, etc.
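The correctness criterion underlying such serializable classes is conflict serializability, which can be checked mechanically by building the serialization graph of a history and testing it for cycles. The sketch below does this for simple read/write histories; it is the standard textbook construction, not the paper's event-order formalism.

```python
# Standard conflict-serializability test: build the serialization graph of a
# history (edge Ti -> Tj if an operation of Ti conflicts with a later operation
# of Tj on the same item) and check it for cycles. Acyclic => serializable.

def serialization_graph(history):
    """history: list of (txn, op, item) with op in {'r', 'w'}."""
    edges = set()
    for i, (ti, oi, xi) in enumerate(history):
        for tj, oj, xj in history[i + 1:]:
            conflicting = xi == xj and ti != tj and ("w" in (oi, oj))
            if conflicting:
                edges.add((ti, tj))
    return edges

def has_cycle(edges):
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
    def reachable(start, target, seen):
        return any(nxt == target or (nxt not in seen and
                                     reachable(nxt, target, seen | {nxt}))
                   for nxt in graph.get(start, ()))
    return any(reachable(node, node, set()) for node in graph)

h1 = [("T1", "r", "x"), ("T2", "w", "x"), ("T1", "w", "y"), ("T2", "r", "y")]
h2 = [("T1", "r", "x"), ("T2", "w", "x"), ("T2", "w", "y"), ("T1", "r", "y")]
print(has_cycle(serialization_graph(h1)))   # False: equivalent to T1 then T2
print(has_cycle(serialization_graph(h2)))   # True: not serializable
```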

Proceedings ArticleDOI
16 May 1983
TL;DR: This paper discusses the application requirements, the design and architecture of the database, and the algorithms used for updates and inserts in EMPACT™.
Abstract: EMPACT™, TANDEM Computers' manufacturing information control system, is an application that uses a distributed database. The requirements of the system include supporting multiple sites, providing continuous availability to the data, and controlling updates to communal information. The approach taken to satisfy these needs involves the use of both partitioned and replicated data. Presented in this paper is a discussion of the application requirements, the design and architecture of the database, and the algorithms used for updates and inserts.

Proceedings ArticleDOI
16 May 1983
TL;DR: The optimization minimizes the number of disk accesses by taking advantage of the access paths available to the CODASYL local database management systems and the relationship information of the variables used in the relational commands.
Abstract: A new query translation and optimization algorithm is presented. The algorithm is being implemented as the local query translation and optimization technique of Honeywell's Distributed Database Testbed System (DDTS). The algorithm translates local queries expressed in representational schemas (relational) to their equivalent internal schemas (network). The technique is new in that it does not translate each relational command in isolation, but rather attempts to find a collection of relational commands for which an optimized sequence of CODASYL DML commands can be generated. The optimization minimizes the number of disk accesses by taking advantage of the access paths available to the CODASYL local database management systems and the relationship information of the variables used in the relational commands.

Proceedings Article
31 Oct 1983
TL;DR: This paper develops and evaluates algorithms that perform the partitioning and allocation of the database over the processor nodes of the network in a computationally feasible manner and proposes a mixed benefit evaluation strategy.
Abstract: In a distributed database system the partitioning and allocation of the database over the processor nodes of the network can be a critical aspect of the database design effort. In this paper we develop and evaluate algorithms that perform this task in a computationally feasible manner. The network we consider is characterized by a relatively high communication bandwidth, considering the processing and input/output capacities of its processors. Such a balance is typical if the processors are connected via busses or local networks. The common constraint that transactions have a specific root node no longer exists, so that there are more distribution choices. However, a poor distribution leads to less efficient computation, higher costs, and higher loads in the nodes or in the communication network, so that the system may not be able to handle the required set of transactions. Our approach is to first split the database into fragments which constitute appropriate units for allocation. The fragments to be allocated are selected based on maximal benefit criteria using a greedy heuristic. The assignment to processor nodes uses a first-fit algorithm. The complete algorithm, called GFF, is stated in a procedural form. The complexity of the problem and of its candidate solutions are analyzed and several interesting relationships are proven. Alternate benefit metrics are considered, since the execution cost of the allocation procedure varies by orders of magnitude with the alternatives of benefit evaluation. A mixed benefit evaluation strategy is eventually proposed. A model for evaluation is presented. Two of the strategies are experimentally evaluated, and the reported results support the discussion. The approach should be suitable for other cases where resources have to be allocated subject to resource constraints.
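The combination GFF describes, greedy fragment selection by benefit followed by first-fit assignment to capacity-constrained nodes, can be sketched as below. The benefit values are supplied directly per fragment for illustration rather than computed from the paper's workload model, and the capacities are assumptions.

```python
# Minimal greedy / first-fit sketch in the spirit of GFF: pick fragments in
# decreasing order of (given) benefit and place each on the first node with
# enough remaining capacity.

def greedy_first_fit(fragments, node_capacity):
    remaining = dict(node_capacity)               # node -> free capacity
    placement = {}
    # Greedy step: consider the most beneficial fragments first.
    for frag in sorted(fragments, key=lambda f: f["benefit"], reverse=True):
        # First-fit step: the first node that still has room gets the fragment.
        for node, free in remaining.items():
            if frag["size"] <= free:
                placement[frag["name"]] = node
                remaining[node] -= frag["size"]
                break
        else:
            placement[frag["name"]] = None        # could not be allocated
    return placement

fragments = [
    {"name": "F1", "size": 40, "benefit": 90},
    {"name": "F2", "size": 30, "benefit": 70},
    {"name": "F3", "size": 50, "benefit": 20},
]
print(greedy_first_fit(fragments, {"n1": 60, "n2": 50}))
# {'F1': 'n1', 'F2': 'n2', 'F3': None}  -- F3 no longer fits anywhere
```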

Proceedings ArticleDOI
07 Dec 1983
TL;DR: The concept of a federative database server is introduced and is shown to be an excellent solution in many cases and three possible architectures for such a database system are presented.
Abstract: Recently, more and more personal computers are getting linked by means of a local area network. This calls for network-wide services, among others for a network database service. However, the question of an appropriate architecture for such a database system is quite open. This paper establishes a set of criteria for this evaluation and presents three possible architectures. Among them, the concept of a federative database server is introduced and is shown to be an excellent solution in many cases. Hardware dependencies are identified and hints for optimal decisions in different hardware environments are given. Finally, the concept chosen by the authors for implementation is presented.

Journal ArticleDOI
10 Oct 1983
TL;DR: R* as mentioned in this paper is an experimental prototype distributed database management system, which uses virtual circuit communication links to connect the tree of processes in an R* computation and to provide message ordering, flow control, and error detection and reporting.
Abstract: R* is an experimental prototype distributed database management system. The computation needed to perform a sequence of multisite user transactions in R* is structured as a tree of processes communicating over virtual circuit communication links. Distributed computation can be supported by providing a server process per site which performs requests on behalf of remote users. Alternatively, a new process could be created to service each incoming request. Instead of using a shared server process or using the process per request approach, R* creates a process associated with the computation of the user on the first request to the remote site. This process is incorporated into the tree of processes serving a single user and is retained for the duration of the user computation. This approach allows R* to factor some of the request execution overhead into the process creation phase, and simplifies the retention of user and transaction context at the multiple sites of the distributed computation. R* uses virtual circuit communication links to connect the tree of processes in an R* computation. Virtual circuits provide message ordering, flow control, and error detection and reporting. Especially important in the distributed transaction processing environment is the ability of the virtual circuit facility to detect and report any process, processor, or communication failures to the end points of the virtual circuit. Error detection and reporting by the virtual circuit facility is used to manage the tree of processes comprising a computation and to handle correctly the resolution of distributed transactions in the presence of various kinds of failures. R* uses the communication facility in a variety of ways. Many functions use a synchronous, remote procedure call protocol to perform work at remote sites. Site authentication, user identification, data definition, and database catalog management are all implemented using this remote procedure activation protocol. Query planning, on the other hand, distributes query execution plans in parallel to the sites involved. Parallel plan distribution allows server sites to overlap the computation needed to validate and store query execution plans. The query execution plans often involve passing data streams from site to site, with each site transforming the data stream in some way. The execution of data access requests exploits virtual circuit flow controls to allow overlapped execution at data producer and data consumer sites. Finally, distributed transaction management in R* uses the virtual circuits connecting the process tree to exchange the messages of the two-phase commit protocol. However, if a failure occurs during the commit protocol, the virtual circuits and processes of the original computation may be lost. When failures interrupt the distributed commit protocol, R* reverts to a datagram-oriented protocol to transfer the messages needed to resolve the outstanding transaction. R* also uses datagrams to communicate the information needed to detect multi-site deadlocks. The R* approach to distributed computation may be contrasted with datagram-based and server-oriented distributed systems. The retention of remote processes, and the virtual circuits connecting them, for the duration of the user computation improves execution performance whenever repeated accesses are made to a remote site. The retention of remote processes is also helpful for maintaining user and transaction context between requests to the remote site.
The use of virtual circuits allows several concerns, such as message ordering and flow control, to be relegated to the network and virtual circuit implementation. The ability of the virtual circuit implementation to report failures is fundamental to the management of the R* distributed computation. Currently, R* is running on multiple processors and is able to perform any SQL statement on local or remote data. This includes not only data definition and catalog manipulation statements, but also n-way joins, subqueries, and data update statements. Besides the SQL language constructs, the transaction management and distributed deadlock detection protocols are implemented and running. The tree structure of the R* computation and the use of virtual circuits have proved to be quite well adapted to the problems of implementing and controlling the complex, distributed computations needed to support the execution of a distributed database management system.
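The process-retention idea described here (create a process at a remote site on the first request of a user computation, then reuse it for later requests instead of paying creation cost each time) can be sketched with a simple cache of workers keyed by computation. The worker state and the cost numbers below are illustrative stand-ins, not R*'s implementation.

```python
# Sketch of R*-style process retention: the first request a user computation
# makes to a remote site creates (and pays for) a worker there; later requests
# from the same computation reuse that worker and its retained context.

PROCESS_CREATION_COST = 50          # pretend milliseconds, illustrative only
REQUEST_COST = 5

class RemoteSite:
    def __init__(self, name):
        self.name = name
        self.workers = {}           # computation_id -> retained worker state

    def handle(self, computation_id, request):
        cost = REQUEST_COST
        if computation_id not in self.workers:
            # First request from this computation: create and retain a worker.
            self.workers[computation_id] = {"context": []}
            cost += PROCESS_CREATION_COST
        self.workers[computation_id]["context"].append(request)
        return cost

site_b = RemoteSite("B")
costs = [site_b.handle("user1-txn7", q) for q in ("parse", "plan", "execute")]
print(costs)          # [55, 5, 5]: creation cost paid once, then reused
print(len(site_b.workers["user1-txn7"]["context"]))   # 3 requests, one process
```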

Journal ArticleDOI
TL;DR: The types of undesirable events that occur in a database environment are classified and the necessary recovery information, with subsequent actions to recover the correct state of the database, is summarized.
Abstract: The need for robust recovery facilities in modern database management systems is quite well known. Various authors have addressed recovery facilities and specific techniques, but none have delved into the problem of recovery in database machines. In this paper, the types of undesirable events that occur in a database environment are classified and the necessary recovery information, with subsequent actions to recover the correct state of the database, is summarized. A model of the “processor-per-track” class of parallel associative database processor is presented. Three different types of recovery mechanisms that may be considered for parallel associative database processors are identified. For each architecture, both the workload imposed by the recovery mechanisms on the execution of database operations (i.e., retrieve, modify, delete, and insert) and the workload involved in the recovery actions (i.e., rollback, restart, restore, and reconstruct) are analyzed. The performance of the three architectures is quantitatively compared. This comparison is made in terms of the number of extra revolutions of the database area required to process a transaction versus the number of records affected by a transaction. A variety of different design parameters of the database processor, of the database, and of a mix of transaction types (modify, insert, and delete) are considered. A large number of combinations is selected and the effects of the parameters on the extra processing time are identified.

Proceedings ArticleDOI
16 May 1983
TL;DR: A comparative evaluation of the Goodyear massively parallel processor and an abstract conventional computer is carried out by examining specific database management functions rather than an entire database management system.
Abstract: The Goodyear massively parallel processor (MPP) represents a new architecture with the potential for providing improved solutions to applications benefiting from highly parallel operation. In this paper, application of the MPP to database management systems is examined. Specifically, the relational database model is considered. Database management has been selected as a candidate application of the MPP because of the positive results achieved in previous work related to parallel architectures and database systems. The relational model has been selected for its applicability to parallel processing, its mathematical foundation, and its general recognition as a model that is superior in many respects to the hierarchical and network models. The paper concentrates on a comparative evaluation of the MPP and an abstract conventional computer by examining specific database management functions rather than an entire database management system.