
Showing papers on "Distributed database published in 1983"


Journal ArticleDOI
TL;DR: It is shown that all true deadlocks are detected and that no false deadlocks are reported, and that the algorithms can be applied in distributed database and other message communication systems.
Abstract: Distributed deadlock models are presented for resource and communication deadlocks. Simple distributed algorithms for detection of these deadlocks are given. We show that all true deadlocks are detected and that no false deadlocks are reported. In our algorithms, no process maintains global information; all messages have an identical short length. The algorithms can be applied in distributed database and other message communication systems.
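The resource-deadlock algorithm here is probe-based: a blocked process sends a small, fixed-size probe along wait-for edges, and a deadlock is declared when a probe returns to its initiator. Below is a minimal, self-contained sketch of that idea over a static wait-for graph; the `wait_for` map and process names are illustrative, not taken from the paper.

```python
# Minimal sketch of probe-based deadlock detection over a static wait-for graph.
# A blocked process sends a probe (initiator, sender, receiver) along wait-for
# edges; if a probe ever reaches the initiator again, a true deadlock exists.

def detect_deadlock(initiator, wait_for):
    """Return True if `initiator` lies on a wait-for cycle."""
    probes = [(initiator, initiator, blocked_on)
              for blocked_on in wait_for.get(initiator, [])]
    seen = set()                      # (sender, receiver) pairs already probed
    while probes:
        origin, sender, receiver = probes.pop()
        if receiver == origin:        # probe returned to initiator: deadlock
            return True
        if (sender, receiver) in seen:
            continue
        seen.add((sender, receiver))
        # Forward the probe along every outgoing wait-for edge of `receiver`.
        probes.extend((origin, receiver, nxt)
                      for nxt in wait_for.get(receiver, []))
    return False

# Example wait-for graph: P1 -> P2 -> P3 -> P1 is a cycle; P4 waits on P1.
wait_for = {"P1": ["P2"], "P2": ["P3"], "P3": ["P1"], "P4": ["P1"]}
print(detect_deadlock("P1", wait_for))  # True: P1 is on the cycle
print(detect_deadlock("P4", wait_for))  # False: P4 waits on the cycle but is not on it
```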

449 citations


Journal ArticleDOI
TL;DR: A formal model for atomic commit protocols for a distributed database system is introduced and is used to prove existence results about resilient protocols for site failures that do not partition the network and then for partitioned networks.
Abstract: A formal model for atomic commit protocols for a distributed database system is introduced. The model is used to prove existence results about resilient protocols for site failures that do not partition the network and then for partitioned networks. For site failures, a pessimistic recovery technique, called independent recovery, is introduced and the class of failures for which resilient protocols exist is identified. For partitioned networks, two cases are studied: the pessimistic case in which messages are lost, and the optimistic case in which no messages are lost. In all cases, fundamental limitations on the resiliency of protocols are derived.
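For concreteness, the sketch below shows the decision rule of the standard two-phase commit protocol, the canonical atomic commit protocol the paper's formal model is built around; it is the textbook rule only, not the paper's model or its independent-recovery construction, and names such as `Participant` are illustrative.

```python
# Illustrative two-phase commit decision rule (failure handling omitted):
# the coordinator commits only if every participant votes YES; otherwise aborts.

class Participant:
    def __init__(self, name, can_commit):
        self.name = name
        self.can_commit = can_commit
        self.outcome = None

    def vote(self):
        return "YES" if self.can_commit else "NO"

    def decide(self, decision):
        self.outcome = decision

def two_phase_commit(participants):
    # Phase 1: collect votes from all participants.
    votes = [p.vote() for p in participants]
    decision = "COMMIT" if all(v == "YES" for v in votes) else "ABORT"
    # Phase 2: broadcast the decision.
    for p in participants:
        p.decide(decision)
    return decision

sites = [Participant("A", True), Participant("B", True), Participant("C", False)]
print(two_phase_commit(sites))  # ABORT, because site C voted NO
```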

301 citations


Journal ArticleDOI
TL;DR: A new algorithm (Algorithm GENERAL) is presented to derive processing strategies for arbitrarily complex queries to minimize the response time and the total time for distributed queries.
Abstract: The efficiency of processing strategies for queries in a distributed database is critical for system performance. Methods are studied to minimize the response time and the total time for distributed queries. A new algorithm (Algorithm GENERAL) is presented to derive processing strategies for arbitrarily complex queries. Three versions of the algorithm are given: one for minimizing response time and two for minimizing total time. The algorithm is shown to provide optimal solutions under certain conditions.
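The distinction the paper draws between minimizing response time and minimizing total time can be illustrated with a toy cost model: for a set of data transmissions that can proceed in parallel, total time sums the transmission costs, while response time is governed by the most expensive single transmission. The linear cost function below is an assumption for illustration, not necessarily the paper's exact model.

```python
# Toy cost model: each transmission of `size` bytes costs C0 + C1 * size.
# Total time sums the costs of all transmissions; response time, when the
# transmissions proceed in parallel, is set by the most expensive one.

C0, C1 = 20.0, 0.1   # illustrative startup cost and per-byte cost

def transmission_cost(size_bytes):
    return C0 + C1 * size_bytes

def total_time(transfer_sizes):
    return sum(transmission_cost(s) for s in transfer_sizes)

def response_time(transfer_sizes):
    return max(transmission_cost(s) for s in transfer_sizes)

transfers = [1000, 400, 250]         # bytes shipped from three sites
print(total_time(transfers))         # 225.0 (120 + 60 + 45)
print(response_time(transfers))      # 120.0 (the largest single transfer)
```

A strategy that is optimal for one objective need not be optimal for the other, which is why the paper gives separate versions of Algorithm GENERAL for each.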

266 citations


Journal ArticleDOI
Leslie Lamport1
TL;DR: It is shown that, like the original Byzantine Generals Problem, the weak version can be solved only if fewer than one-third of the processes may fail, and that an approximate solution exists that can tolerate arbitrarily many failures.
Abstract: The Byzantine Generals Problem requires processes to reach agreement upon a value even though some of them may fail. It is weakened by allowing them to agree upon an "incorrect" value if a failure occurs. The transaction commit problem for a distributed database is a special case of the weaker problem. It is shown that, like the original Byzantine Generals Problem, the weak version can be solved only if fewer than one-third of the processes may fail. Unlike the original problem, an approximate solution exists that can tolerate arbitrarily many failures.

187 citations


Proceedings ArticleDOI
17 Aug 1983
TL;DR: A theory for proving the correctness of algorithms that manage replicated data is presented, an extension of serializability theory, and is applied to three replicated data algorithms: Gifford's “quorum consensus” algorithm, Eager and Sevcik's “missing writes” algorithm, and Computer Corporation of America's “available copies” algorithm.
Abstract: A replicated database is a distributed database in which some data items are stored redundantly at multiple sites. The main goal is to improve system reliability. By storing critical data at multiple sites, the system can operate even though some sites have failed. However, few distributed database systems support replicated data, because it is difficult to manage as sites fail and recover. A replicated data algorithm has two parts. One is a discipline for reading and writing data item copies. The other is a concurrency control algorithm for synchronizing those operations. The read-write discipline ensures that if one transaction writes logical data item x, and another transaction reads or writes x, there is some physical manifestation of that logical conflict. The concurrency control algorithm synchronizes physical conflicts; it knows nothing about logical conflicts. In a correct replicated data algorithm, the physical manifestation of conflicts must be strong enough so that synchronizing physical conflicts is sufficient for correctness. This paper presents a theory for proving the correctness of algorithms that manage replicated data. The theory is an extension of serializability theory. We apply it to three replicated data algorithms: Gifford's “quorum consensus” algorithm, Eager and Sevcik's “missing writes” algorithm, and Computer Corporation of America's “available copies” algorithm.
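Gifford's quorum consensus, one of the three algorithms analyzed, can be pictured as follows: each copy carries a version number, a read collects a read quorum and takes the highest-versioned value, and a write installs a new version at a write quorum, with r + w > n guaranteeing that every read quorum intersects every write quorum. The class below is an illustrative sketch under those assumptions, not the paper's formalism or Gifford's original code.

```python
# Sketch of quorum-consensus replication: n copies, read quorum r, write
# quorum w, with r + w > n so any read quorum intersects any write quorum.

class ReplicatedItem:
    def __init__(self, n, r, w, initial=None):
        assert r + w > n, "quorums must intersect"
        self.copies = [{"version": 0, "value": initial} for _ in range(n)]
        self.r, self.w = r, w

    def read(self, available_sites):
        quorum = available_sites[: self.r]
        assert len(quorum) == self.r, "not enough sites for a read quorum"
        # The copy with the highest version number is current.
        newest = max((self.copies[i] for i in quorum), key=lambda c: c["version"])
        return newest["value"]

    def write(self, available_sites, value):
        quorum = available_sites[: self.w]
        assert len(quorum) == self.w, "not enough sites for a write quorum"
        new_version = max(c["version"] for c in self.copies) + 1
        for i in quorum:
            self.copies[i] = {"version": new_version, "value": value}

item = ReplicatedItem(n=5, r=3, w=3, initial="v0")
item.write(available_sites=[0, 1, 2], value="v1")    # sites 3 and 4 are down
print(item.read(available_sites=[2, 3, 4]))          # "v1": quorums intersect at copy 2
```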

175 citations


Journal ArticleDOI
TL;DR: A level of robustness termed maximal partial operability is identified; under the models of concurrency control and robustness introduced here, it is the highest level attainable without significantly degrading performance.
Abstract: The problem of concurrency control in distributed database systems in which site and communication link failures may occur is considered. The possible range of failures is not restricted; in particular, failures may induce an arbitrary network partitioning. It is desirable to attain a high “level of robustness” in such a system; that is, these failures should have only a small impact on system operation. A level of robustness termed maximal partial operability is identified. Under our models of concurrency control and robustness, this robustness level is the highest level attainable without significantly degrading performance. A basis for the implementation of maximal partial operability is presented. To illustrate its use, it is applied to a distributed locking concurrency control method and to a method that utilizes timestamps. When no failures are present, the robustness modifications for these methods induce no significant additional overhead.
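One simple way to keep part of a partitioned system operating, in the spirit of the robustness levels discussed here, is to allow updates to a data item only in the partition holding a majority of that item's copies. The sketch below illustrates that rule of thumb; it is an assumption for illustration, not the paper's specific "maximal partial operability" construction.

```python
# Illustrative partition-operability rule: a group of mutually reachable sites
# may update a data item only if it contains a strict majority of the sites
# holding copies of that item.

def can_update(item_copy_sites, reachable_sites):
    reachable_copies = set(item_copy_sites) & set(reachable_sites)
    return len(reachable_copies) * 2 > len(item_copy_sites)

copies_of_x = {"s1", "s2", "s3", "s4", "s5"}
partition_a = {"s1", "s2", "s3"}          # network partition: two groups
partition_b = {"s4", "s5"}

print(can_update(copies_of_x, partition_a))  # True: holds 3 of 5 copies
print(can_update(copies_of_x, partition_b))  # False: holds only 2 of 5 copies
```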

143 citations


Journal ArticleDOI
TL;DR: The optimal distribution of a database schema over a number of sites in a distributed network is considered and the design is driven by user-supplied information about data distribution.
Abstract: The optimal distribution of a database schema over a number of sites in a distributed network is considered. The database is modeled in terms of objects (relations or record sets) and links (predefined joins or CODASYL sets). The design is driven by user-supplied information about data distribution. The inputs required by the optimization model are: 1) cardinality and size information about objects and links, 2) a set of candidate horizontal partitions of relations into fragments and the allocations of the fragments, and 3) the specification of all important transactions, their frequencies, and their sites of origin.
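The three kinds of inputs the optimization model requires (statistics about objects and links, candidate horizontal fragments with their allocations, and the transaction workload) can be captured in a small data structure. The field names below are illustrative assumptions, not the paper's notation.

```python
# Illustrative encoding of the design inputs: object statistics, candidate
# horizontal fragments with their candidate sites, and the transaction workload.

from dataclasses import dataclass, field

@dataclass
class ObjectStats:
    name: str
    cardinality: int          # number of records / tuples
    record_size: int          # bytes per record

@dataclass
class Fragment:
    object_name: str
    predicate: str            # horizontal partitioning predicate
    candidate_sites: list     # sites where this fragment may be allocated

@dataclass
class Transaction:
    name: str
    frequency: float          # executions per unit time
    origin_site: str
    objects_read: list = field(default_factory=list)
    objects_written: list = field(default_factory=list)

design_inputs = {
    "objects": [ObjectStats("EMPLOYEE", 50_000, 120)],
    "fragments": [Fragment("EMPLOYEE", "dept = 'EU'", ["paris", "milan"]),
                  Fragment("EMPLOYEE", "dept = 'US'", ["boston"])],
    "transactions": [Transaction("payroll", 200.0, "boston",
                                 objects_read=["EMPLOYEE"])],
}
print(len(design_inputs["fragments"]))   # 2 candidate fragments
```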

139 citations


Proceedings ArticleDOI
21 Mar 1983
TL;DR: A reliability algorithm being considered for DDM, a distributed database system under development at Computer Corporation of America, is designed to tolerate clean site failures in which sites simply stop running.
Abstract: We describe a reliability algorithm being considered for DDM, a distributed database system under development at Computer Corporation of America. The algorithm is designed to tolerate clean site failures in which sites simply stop running. The algorithm allows the system to reconfigure itself to run correctly as sites fail and recover. The algorithm solves the subproblems of atomic commit and replicated data handling in an integrated manner.

73 citations


Proceedings ArticleDOI
01 May 1983
TL;DR: An algorithm is given to process a query in a fragmented distributed database environment; it makes use of redundant relations to reduce communication cost, and a process to estimate the cost and benefit of a semi-join, based on dynamic execution of semi-joins, is introduced.
Abstract: An algorithm is given to process a given query in a fragmented distributed data base environment. Unlike previous algorithms, it has the following desired features: (1) it makes use of redundant relations to reduce communication cost; (2) a copy of each relation referenced by the query is selected so that the set of relations is contained in the minimum number of sites; (3) an efficient algorithm to process fragments is provided; (4) all relations that need not be sent to the assembly site to produce the answer are identified, so the unnecessary sending of these relations across sites and processing on these relations, which are common in earlier algorithms, are avoided; (5) useless semi-joins are discarded and "worse" semi-joins are replaced by better ones; (6) a process to estimate the cost and the benefit of a semi-join, based on dynamic execution of semi-joins, is introduced, and this new process is expected to be more accurate than earlier estimation processes. The algorithm is easy to implement and is operational.
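Feature (6) rates each semi-join by weighing its cost against its benefit. A common way to make that concrete, shown below under assumed statistics, is: the cost is shipping the projected join column, the benefit is the reduction in the size of the relation being reduced, and only semi-joins whose benefit exceeds their cost are kept. The selectivity-based formulas are standard static estimates, not the paper's dynamic ones.

```python
# Illustrative semi-join cost/benefit estimate (static, textbook-style; the
# paper refines this using dynamic execution of semi-joins).
# Semi-join R ⋉ S on attribute A: ship the projected column of S to R's site
# and keep only the R tuples that join. Benefit = bytes removed from R;
# cost = bytes shipped.

def semijoin_profit(r_size_bytes, s_proj_size_bytes, selectivity):
    """selectivity = fraction of R tuples that survive the semi-join."""
    benefit = (1.0 - selectivity) * r_size_bytes   # R shrinks by this much
    cost = s_proj_size_bytes                       # projected column shipped
    return benefit - cost

# Keep only the profitable semi-joins.
candidates = {
    "R1 semi-join S on A": semijoin_profit(r_size_bytes=80_000,
                                           s_proj_size_bytes=5_000,
                                           selectivity=0.3),
    "R2 semi-join S on B": semijoin_profit(r_size_bytes=10_000,
                                           s_proj_size_bytes=9_000,
                                           selectivity=0.9),
}
profitable = {name: p for name, p in candidates.items() if p > 0}
print(profitable)   # only the first semi-join pays off
```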

71 citations


Proceedings ArticleDOI
21 Mar 1983
TL;DR: This paper presents solutions that allow concurrent access to extendible hash files, based on locking protocols and minor modifications to the data structure, and describes first attempts at adapting extendible hash files for distributed data.
Abstract: The extendible hash file is a dynamic data structure that is an alternative to B-trees for use as a database index. While there have been many algorithms proposed to allow concurrent access to B-trees, similar solutions for extendible hash files have not appeared. In this paper, we present solutions to allow for concurrency that are based on locking protocols and minor modifications in the data structure. Another question that deserves consideration is whether these indexing structures can be adapted for use in a distributed database. Among the motivations for distributing data are increased availability and ease of growth; however, unless data structures in the access path are designed to support those goals, they may not be realized. We describe some first attempts at adapting extendible hash files for distributed data.
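A minimal way to picture "locking plus minor structural modifications" for extendible hashing: a reader takes the directory entry without a lock, locks the target bucket, and then re-verifies that the bucket still covers the key's hash prefix (a concurrent split may have moved the key), retrying through the directory if not. The sketch below shows only that lookup protocol with per-bucket locks and omits splitting; it is an assumed illustration, not the paper's exact protocol.

```python
# Sketch of a concurrent lookup protocol for an extendible hash file:
# read the directory without locking, lock the bucket, then verify that the
# bucket still covers the key's hash prefix and retry if it does not.
# Bucket splitting itself is omitted.

import threading

class Bucket:
    def __init__(self, local_depth):
        self.local_depth = local_depth   # number of hash bits this bucket covers
        self.prefix = 0                  # value of those bits for this bucket
        self.records = {}
        self.lock = threading.Lock()

class ExtendibleHashFile:
    def __init__(self, global_depth, buckets):
        self.global_depth = global_depth
        self.directory = buckets         # 2**global_depth entries

    def _dir_index(self, key):
        return hash(key) & ((1 << self.global_depth) - 1)

    def lookup(self, key):
        while True:
            bucket = self.directory[self._dir_index(key)]   # unlocked read
            with bucket.lock:
                # Verify: does this bucket still cover the key's hash prefix?
                mask = (1 << bucket.local_depth) - 1
                if (hash(key) & mask) == bucket.prefix:
                    return bucket.records.get(key)
            # Bucket was split concurrently: retry via the directory.

# Tiny illustration with global depth 1 and two buckets (integer keys so the
# hash values are deterministic).
b0, b1 = Bucket(local_depth=1), Bucket(local_depth=1)
b0.prefix, b1.prefix = 0, 1
b0.records[4] = "record for key 4"   # hash(4) & 1 == 0
b1.records[7] = "record for key 7"   # hash(7) & 1 == 1
ehf = ExtendibleHashFile(global_depth=1, buckets=[b0, b1])
print(ehf.lookup(4), ehf.lookup(7), ehf.lookup(10))   # record..., record..., None
```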

59 citations


Proceedings ArticleDOI
01 May 1983
TL;DR: The concept of "local sufficiency" is introduced as a measure of parallelism, and it is shown how certain classes of queries lead naturally to irredundant partitions of a database that are locally sufficient.
Abstract: In this paper we treat the problem of subdividing a database and allocating the fragments to the sites in a distributed database system in order to maximize non-duplicative parallelism. Our goal is to establish a conceptual framework for distributing data without being committed to specific cost models. We introduce the concept of "local sufficiency" as a measure of parallelism, and show how certain classes of queries lead naturally to irredundant partitions of a database that are locally sufficient. For classes of queries for which no irredundant distribution is locally sufficient, we offer ways to introduce redundancy in achieving local sufficiency.

Proceedings ArticleDOI
01 May 1983
TL;DR: Adaplex is an integrated language for programming database applications; this paper provides an overview of DDM, a distributed database manager that supports the use of Adaplex as its interface language.
Abstract: Adaplex is an integrated language for programming database applications. It results from the embedding of the database sublanguage DAPLEX in the general purpose programming language Ada. This paper provides an overview of DDM, a distributed database manager that supports the use of Adaplex as an interface language. The important technical innovations we have incorporated in the design of this system include: 1. An advanced data model that captures more application semantics than conventional data models. 2. Support for flexible data distribution options that improve locality of reference and efficiency of query processing. 3. Extensive query optimization that combines compile time access path optimization with run time site selection. 4. Efficient transaction management that reduces transaction conflicts and improves the resiliency of replicated data. 5. Robust, incremental recovery management that provides for automatic recovery from certain "catastrophic" failure conditions.

Journal ArticleDOI
TL;DR: The problem of proving the correctness of execution strategies is reduced to the problem of proving the equivalence of two expressions of Multirelational Algebra, which constitutes a theoretical foundation for the design of query optimizers for distributed databases.
Abstract: A major requirement of a Distributed DataBase Management System (DDBMS) is to enable users to write queries as though the database were not distributed (distribution transparency). The DDBMS transforms the user's queries into execution strategies, that is, sequences of operations on the various nodes of the network and of transmissions between them. An execution strategy on a distributed database is correct if it returns the same result as if the query were applied to a nondistributed database. This paper analyzes the correctness problem for query execution strategies. A formal model, called Multirelational Algebra, is used as a unifying framework for this purpose. The problem of proving the correctness of execution strategies is reduced to the problem of proving the equivalence of two expressions of Multirelational Algebra. A set of theorems on equivalence is given in order to facilitate this task. The proposed approach can also be used for the generation of correct execution strategies, because it defines the rules which allow the transformation of a correct strategy into an equivalent one. This paper does not deal with the problem of evaluating equivalent strategies, and therefore is not in itself a proposal for a query optimizer for distributed databases. However, it constitutes a theoretical foundation for the design of such optimizers.
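A typical equivalence such an algebra supports is that a selection over a horizontally fragmented relation can be pushed down to the fragments and the results unioned, letting each node filter its own fragment. The sketch below checks that rule numerically on toy data, using plain Python lists in place of the paper's Multirelational Algebra expressions.

```python
# Toy check of a standard distribution-transparency equivalence:
#   select_p(R1 union R2) == select_p(R1) union select_p(R2)
# i.e. a selection over a horizontally fragmented relation equals the union of
# selections applied locally at each fragment's node.

def select(predicate, tuples):
    return [t for t in tuples if predicate(t)]

# Horizontal fragments of relation EMP stored at two nodes (toy data).
emp_node1 = [("ann", "paris", 40), ("bob", "paris", 55)]
emp_node2 = [("carl", "milan", 38), ("dora", "milan", 61)]
p = lambda t: t[2] > 50                      # employees older than 50

# Strategy A: ship everything to one node, then select (nondistributed view).
centralized = select(p, emp_node1 + emp_node2)

# Strategy B: select locally at each node, then union the small results.
distributed = select(p, emp_node1) + select(p, emp_node2)

assert sorted(centralized) == sorted(distributed)   # same answer: correct strategy
print(distributed)   # [('bob', 'paris', 55), ('dora', 'milan', 61)]
```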


Journal ArticleDOI
TL;DR: Experimental results are presented which demonstrate that in certain environments and under certain important applications, locality of reference is an undeniable characteristic of the information accessing behavior of a hierarchical database management system.
Abstract: Localized information referencing is a long-known and much-exploited facet of program behavior. The existence of such behavior in the data accessing patterns produced by database management systems is not currently supported by empirical results. We present experimental results which demonstrate that in certain environments and under certain important applications, locality of reference is an undeniable characteristic of the information accessing behavior of a hierarchical database management system. Furthermore, database locality of reference is in a sense more regular, predictable, and hence, more exploitable than the localized reference activity found in programs in general. The implications of these results for the performance enhancement and workload characterization of database management systems are discussed.
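One standard way to quantify the locality such experiments measure is the LRU stack distance of each reference in an access trace: strong locality shows up as many small distances. The sketch below computes stack distances for a synthetic trace; the trace and the threshold are illustrative, not the paper's workload.

```python
# LRU stack-distance profile of a reference trace: for each access, the
# distance is how many distinct pages were touched since the last access to
# the same page (inf on first touch). Clustered small distances = high locality.

def lru_stack_distances(trace):
    stack = []                       # most recently used page at the end
    distances = []
    for page in trace:
        if page in stack:
            pos = len(stack) - 1 - stack.index(page)   # 0 = re-reference of MRU page
            distances.append(pos)
            stack.remove(page)
        else:
            distances.append(float("inf"))             # cold miss
        stack.append(page)
    return distances

trace = ["A", "B", "A", "A", "C", "B", "A", "A", "B"]  # synthetic, highly local
d = lru_stack_distances(trace)
print(d)                                   # [inf, inf, 1, 0, inf, 2, 2, 0, 1]
print(sum(x <= 2 for x in d) / len(d))     # fraction of near re-references
```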


Journal ArticleDOI
TL;DR: An approach to processing distributed queries that makes explicit use of redundant data is proposed, and the role of data redundancy in maximizing parallelism and minimizing data movement is clarified.
Abstract: In this paper an approach to processing distributed queries that makes explicit use of redundant data is proposed. The basic idea is to focus on the dynamics of materialization, defined as the collection of data and partial results available for processing at any given time, as query processing proceeds. In this framework the role of data redundancy in maximizing parallelism and minimizing data movement is clarified. What results is not only the discovery of new algorithms but an improved framework for their evaluation.

Journal ArticleDOI
TL;DR: This work describes a proof schema for analyzing concurrency control correctness and illustrates the proof schema by presenting two new concurrency algorithms for distributed database systems.
Abstract: Concurrency control algorithms for database systems are usually regarded as methods for synchronizing Read and Write operations. Such methods are judged to be correct if they only produce serializable executions. However, Reads and Writes are sometimes inaccurate models of the operations executed by a database system. In such cases, serializability does not capture all aspects of concurrency control executions. To capture these aspects, we describe a proof schema for analyzing concurrency control correctness. We illustrate the proof schema by presenting two new concurrency algorithms for distributed database systems.

Proceedings Article
31 Oct 1983
TL;DR: DODM is described, a simple model for object sharing in distributed database systems that provides a small set of operations for object definition, manipulation, and retrieval in a distributed environment.
Abstract: This paper describes DODM, a simple model for object sharing in distributed database systems. The model provides a small set of operations for object definition, manipulation, and retrieval in a distributed environment. Relationships among objects can be established across database boundaries, objects are relocatable within the distributed environment, and mechanisms are provided for object sharing among individual databases. An object naming convention supports location transparent object references; that is, objects can be referenced by user-defined names rather than by address. The primitive operations introduced can be used as the basis for the specification and stepwise development of database models and database systems of increasing complexity. An example is provided to illustrate the use of DODM in the design of a distributed database system supporting a semantically expressive database model. 1. Introduction. Distributed computing systems are becoming increasingly common. This trend is largely caused by the decreasing cost of hardware: not only are powerful personal computers becoming so inexpensive today that individuals can afford them for personal use, but the cost of computer networks that enable computer systems to exchange information at a very high rate is decreasing drastically. Decentralization overcomes many of the limitations and deficiencies of centralized systems. A network of computers simply provides a higher level of performance, availability, reliability, fault tolerance, and security than a centralized computer system. (This research was supported, in part, by the Joint Services Electronics Program through the Air Force Office of Scientific Research under contract F49820-El-C.0070.) In addition to the technical advantages that make decentralized systems feasible, social attitudes tend to indicate that a collection of smaller, autonomous computer systems is preferable to large central systems. The growing popularity of distributed computing establishes a need for mechanisms that allow individual users to communicate with each other and share both hardware and software resources. Individual users also need access to the growing number of "public" databases, which contain a variety of information such as grocery prices, the values of stocks, and the histories of bank accounts. Of course, sharing mechanisms decrease the autonomy of the components of the distributed environment, and affect the performance, availability, reliability, fault tolerance, and security of the total system. Sharing and communication mechanisms also introduce data transmission and naming problems. Most current approaches to distributed database management system design fail to adequately address issues concerning location transparency (the ability to reference data by name rather than by address), logical decentralization, catalog management, and the uniform handling of meta-data and user-data. Logically centralized database systems [Rothnie 80, Stonebraker 77, Andler 82] provide the users with a single integrated database schema describing all the data in the physically centralized or distributed environment. Recent research has also resulted in approaches to support the integration of heterogeneous as well as homogeneous (preexisting) databases [Motro 81, Smith 81, Litwin 81, Kimbleton 79]. However, a critical remaining problem is accommodating information sharing among individual, autonomous databases.
Finally, existing distributed database system architectures that emphasize the autonomy of the individual databases [Heimbigner 82, Williams 81, Tsichritzis 82] require centralized or complex catalog management. The aim of the research described in this paper is to define a simple model for object sharing in distributed database systems. This is done by stepwise development of a series of object-oriented models. First, a simple model called ODM (for object-oriented database model) is defined. ODM provides a
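The location-transparent naming the model calls for can be pictured as a catalog that maps user-defined names to (site, local identifier) pairs, so a reference by name is bound to a location only at access time, and relocating an object just updates its catalog entry. The sketch below uses illustrative names and is not the DODM operation set.

```python
# Sketch of location-transparent object references: user code holds only names;
# a catalog resolves a name to (site, local_id), so relocating an object means
# updating the catalog, not every reference.

class Catalog:
    def __init__(self):
        self._entries = {}                 # name -> (site, local_id)

    def register(self, name, site, local_id):
        self._entries[name] = (site, local_id)

    def resolve(self, name):
        return self._entries[name]

    def relocate(self, name, new_site, new_local_id):
        self._entries[name] = (new_site, new_local_id)

catalog = Catalog()
catalog.register("stock_prices", site="nyc", local_id=17)

def fetch(name):
    site, local_id = catalog.resolve(name)            # late binding to a location
    return f"read object {local_id} at site {site}"

print(fetch("stock_prices"))                          # reads from nyc
catalog.relocate("stock_prices", new_site="chicago", new_local_id=3)
print(fetch("stock_prices"))                          # same name, new location
```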

Journal ArticleDOI
TL;DR: In this paper, the authors discuss some of the issues raised in the implementation of a distributed database management system by the requirements of site autonomy in the context of the R* research project at IBM's San Jose Research Lab.

Proceedings Article
31 Oct 1983
TL;DR: This paper re-examines the file allocation problem and presents an approach to three versions of the problem, thus demonstrating the flexibility of the approach and arguing that the method provides a practical solution to the problem.
Abstract: In this paper, we re-examine the file allocation problem. Because of changing technology, the assumptions we use here are different from those of previous researchers. Specifically, the interaction of files during processing of queries is explicitly incorporated into our model, and the cost of communication between two sites is dominated by the amount of data transfer and is independent of the receiving and the sending sites. We study the complexity of the file allocation problem using the new model. Unfortunately, the problem is NP-hard. We present an approach to three versions of the problem, thus demonstrating the flexibility of our approach. We further argue that our method provides a practical solution to the problem, because accurate solutions are obtained, the time complexity of our algorithm is much smaller than that of existing algorithms, the algorithm is conceptually simple and easy to implement, and it is adaptive to users' changing access patterns.
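The assumed cost model (communication cost dominated by the amount of data transferred, independent of which two sites are involved) makes a candidate allocation easy to evaluate: sum, over queries, the bytes that must cross site boundaries, weighted by query frequency. The sketch below encodes that evaluation for a toy workload; the structures are illustrative, and it keeps only the basic remote-access term rather than the paper's full model of file interaction.

```python
# Toy evaluation of a file allocation under a distance-independent cost model:
# a query running at its origin site must ship over the network every file it
# reads that is not stored locally; cost = frequency * bytes shipped.

def allocation_cost(queries, allocation, file_sizes):
    total = 0.0
    for q in queries:
        remote_bytes = sum(file_sizes[f] for f in q["files"]
                           if allocation[f] != q["site"])
        total += q["freq"] * remote_bytes
    return total

file_sizes = {"F1": 4_000, "F2": 1_000, "F3": 6_000}
queries = [
    {"site": "s1", "freq": 10, "files": ["F1", "F2"]},
    {"site": "s2", "freq": 3,  "files": ["F2", "F3"]},
]

plan_a = {"F1": "s1", "F2": "s1", "F3": "s2"}
plan_b = {"F1": "s2", "F2": "s2", "F3": "s2"}
print(allocation_cost(queries, plan_a, file_sizes))   # 3000.0  (3 * 1000)
print(allocation_cost(queries, plan_b, file_sizes))   # 50000.0 (10 * 5000)
```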

Journal ArticleDOI
TL;DR: An event order based model for specifying and analyzing concurrency control algorithms for distributed database systems has been presented in this article, where an expanded notion of history that includes the database access events as well as synchronization events is used to study the correctness, degree of concurrency, and other aspects of the algorithms such as deadlocks and reliability.
Abstract: An event order based model for specifying and analyzing concurrency control algorithms for distributed database systems has been presented. An expanded notion of history that includes the database access events as well as synchronization events is used to study the correctness, degree of concurrency, and other aspects of the algorithms such as deadlocks and reliability. The algorithms are mapped into serializable classes that have been defined based on the order of synchronization events such as lock points, commit point, arrival of a transaction, etc.
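The correctness criterion underlying such serializable classes is conflict serializability, which can be checked mechanically by building the serialization graph of a history and testing it for cycles. The sketch below does this for simple read/write histories; it is the standard textbook construction, not the paper's event-order formalism.

```python
# Standard conflict-serializability test: build the serialization graph of a
# history (edge Ti -> Tj if an operation of Ti conflicts with a later operation
# of Tj on the same item) and check it for cycles. Acyclic => serializable.

def serialization_graph(history):
    """history: list of (txn, op, item) with op in {'r', 'w'}."""
    edges = set()
    for i, (ti, oi, xi) in enumerate(history):
        for tj, oj, xj in history[i + 1:]:
            conflicting = xi == xj and ti != tj and ("w" in (oi, oj))
            if conflicting:
                edges.add((ti, tj))
    return edges

def has_cycle(edges):
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
    def reachable(start, target, seen):
        return any(nxt == target or (nxt not in seen and
                                     reachable(nxt, target, seen | {nxt}))
                   for nxt in graph.get(start, ()))
    return any(reachable(node, node, set()) for node in graph)

h1 = [("T1", "r", "x"), ("T2", "w", "x"), ("T1", "w", "y"), ("T2", "r", "y")]
h2 = [("T1", "r", "x"), ("T2", "w", "x"), ("T2", "w", "y"), ("T1", "r", "y")]
print(has_cycle(serialization_graph(h1)))   # False: equivalent to T1 then T2
print(has_cycle(serialization_graph(h2)))   # True: not serializable
```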

Proceedings ArticleDOI
16 May 1983
TL;DR: This paper discusses the application requirements, the design and architecture of the database, and the algorithms used for updates and inserts in EMPACT™.
Abstract: EMPACT™, TANDEM Computers' manufacturing information control system, is an application that uses a distributed database. The requirements of the system include supporting multiple sites, providing continuous availability to the data, and controlling updates to communal information. The approach taken to satisfy these needs involves the use of both partitioned and replicated data. Presented in this paper is a discussion of the application requirements, the design and architecture of the database, and the algorithms used for updates and inserts.

Proceedings ArticleDOI
16 May 1983
TL;DR: The optimization minimizes the number of disk accesses by taking advantage of the access paths available to the CODASYL local database management systems and the relationship information of the variables used in the relational commands.
Abstract: A new query translation and optimization algorithm is presented. The algorithm is being implemented as the local query translation and optimization technique of Honeywell's Distributed Database Testbed System (DDTS). The algorithm translates local queries expressed in representational schemas (relational) to their equivalent internal schemas (network). The technique is new in that it does not translate each relational command in isolation, but rather attempts to find a collection of relational commands for which an optimized sequence of CODASYL DML commands can be generated. The optimization minimizes the number of disk accesses by taking advantage of the access paths available to the CODASYL local database management systems and the relationship information of the variables used in the relational commands.

Proceedings Article
31 Oct 1983
TL;DR: This paper develops and evaluates algorithms that perform the partitioning and allocation of the database over the processor nodes of the network in a computationally feasible manner and proposes a mixed benefit evaluation strategy.
Abstract: In a distributed database system the partitioning and allocation of the database over the processor nodes of the network can be a critical aspect of the database design effort. In this paper we develop and evaluate algorithms that perform this task in a computationally feasible manner. The network we consider is characterized by a relatively high communication bandwidth, considering the processing and input/output capacities of its processors. Such a balance is typical if the processors are connected via busses or local networks. The common constraint that transactions have a specific root node no longer exists, so that there are more distribution choices. However, a poor distribution leads to less efficient computation, higher costs, and higher loads in the nodes or in the communication network, so that the system may not be able to handle the required set of transactions. Our approach is to first split the database into fragments which constitute appropriate units for allocation. The fragments to be allocated are selected based on maximal benefit criteria using a greedy heuristic. The assignment to processor nodes uses a first-fit algorithm. The complete algorithm, called GFF, is stated in a procedural form. The complexity of the problem and of its candidate solutions are analyzed and several interesting relationships are proven. Alternate benefit metrics are considered, since the execution cost of the allocation procedure varies by orders of magnitude with the alternatives of benefit evaluation. A mixed benefit evaluation strategy is eventually proposed. A model for evaluation is presented. Two of the strategies are experimentally evaluated, and the reported results support the discussion. The approach should be suitable for other cases where resources have to be allocated subject to resource constraints.
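The combination GFF describes, greedy fragment selection by benefit followed by first-fit assignment to capacity-constrained nodes, can be sketched as below. The benefit values are supplied directly per fragment for illustration rather than computed from the paper's workload model, and the capacities are assumptions.

```python
# Minimal greedy / first-fit sketch in the spirit of GFF: pick fragments in
# decreasing order of (given) benefit and place each on the first node with
# enough remaining capacity.

def greedy_first_fit(fragments, node_capacity):
    remaining = dict(node_capacity)               # node -> free capacity
    placement = {}
    # Greedy step: consider the most beneficial fragments first.
    for frag in sorted(fragments, key=lambda f: f["benefit"], reverse=True):
        # First-fit step: the first node that still has room gets the fragment.
        for node, free in remaining.items():
            if frag["size"] <= free:
                placement[frag["name"]] = node
                remaining[node] -= frag["size"]
                break
        else:
            placement[frag["name"]] = None        # could not be allocated
    return placement

fragments = [
    {"name": "F1", "size": 40, "benefit": 90},
    {"name": "F2", "size": 30, "benefit": 70},
    {"name": "F3", "size": 50, "benefit": 20},
]
print(greedy_first_fit(fragments, {"n1": 60, "n2": 50}))
# {'F1': 'n1', 'F2': 'n2', 'F3': None}  -- F3 no longer fits anywhere
```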

Proceedings ArticleDOI
07 Dec 1983
TL;DR: The concept of a federative database server is introduced and is shown to be an excellent solution in many cases and three possible architectures for such a database system are presented.
Abstract: Recently, more and more personal computers are getting linked by means of a local area network. This calls for network-wide services, among others for a network database service. However, the question of an appropriate architecture for such a database system is quite open. This paper establishes a set of criteria for this evaluation and presents three possible architectures. Among them, the concept of a federative database server is introduced and is shown to be an excellent solution in many cases. Hardware dependencies are identified and hints for optimal decisions in different hardware environments are given. Finally, the concept chosen by the authors for implementation is presented.

Journal ArticleDOI
10 Oct 1983
TL;DR: R* as mentioned in this paper is an experimental prototype distributed database management system, which uses virtual circuit communication links to connect the tree of processes in an R* computation and to provide message ordering, flow control, and error detection and reporting.
Abstract: R* is an experimental prototype distributed database management system. The computation needed to perform a sequence of multisite user transactions in R* is structured as a tree of processes communicating over virtual circuit communication links. Distributed computation can be supported by providing a server process per site which performs requests on behalf of remote users. Alternatively, a new process could be created to service each incoming request. Instead of using a shared server process or using the process per request approach, R* creates a process associated with the computation of the user on the first request to the remote site. This process is incorporated into the tree of processes serving a single user and is retained for the duration of the user computation. This approach allows R* to factor some of the request execution overhead into the process creation phase, and simplifies the retention of user and transaction context at the multiple sites of the distributed computation. R* uses virtual circuit communication links to connect the tree of processes in an R* computation. Virtual circuits provide message ordering, flow control, and error detection and reporting. Especially important in the distributed transaction processing environment is the ability of the virtual circuit facility to detect and report any process, processor, or communication failures to the end points of the virtual circuit. Error detection and reporting by the virtual circuit facility is used to manage the tree of processes comprising a computation and to handle correctly the resolution of distributed transactions in the presence of various kinds of failures. R* uses the communication facility in a variety of ways. Many functions use a synchronous, remote procedure call protocol to perform work at remote sites. Site authentication, user identification, data definition, and database catalog management are all implemented using this remote procedure activation protocol. Query planning, on the other hand, distributes query execution plans in parallel to the sites involved. Parallel plan distribution allows server sites to overlap the computation needed to validate and store query execution plans. The query execution plans often involve passing data streams from site to site, with each site transforming the data stream in some way. The execution of data access requests exploits virtual circuit flow controls to allow overlapped execution at data producer and data consumer sites. Finally, distributed transaction management in R* uses the virtual circuits connecting the process tree to exchange the messages of the two-phase commit protocol. However, if a failure occurs during the commit protocol, the virtual circuits and processes of the original computation may be lost. When failures interrupt the distributed commit protocol, R* reverts to a datagram-oriented protocol to transfer the messages needed to resolve the outstanding transaction. R* also uses datagrams to communicate the information needed to detect multi-site deadlocks. The R* approach to distributed computation may be contrasted with datagram-based and server-oriented distributed systems. The retention of remote processes, and the virtual circuits connecting them, for the duration of the user computation improves execution performance whenever repeated accesses are made to a remote site. The retention of remote processes is also helpful for maintaining user and transaction context between requests to the remote site.
The use of virtual circuits allows several concerns, such as message ordering and flow control, to be relegated to the network and virtual circuit implementation. The ability of the virtual circuit implementation to report failures is fundamental to the management of the R* distributed computation. Currently, R* is running on multiple processors and is able to perform any SQL statement on local or remote data. This includes not only data definition and catalog manipulation statements, but also n-way joins, subqueries, and data update statements. Besides the SQL language constructs, the transaction management and distributed deadlock detection protocols are implemented and running. The tree structure of the R* computation and the use of virtual circuits have proved to be quite well adapted to the problems of implementing and controlling the complex, distributed computations needed to support the execution of a distributed database management system.
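The process-retention idea described here (create a process at a remote site on the first request of a user computation, then reuse it for later requests instead of paying creation cost each time) can be sketched with a simple cache of workers keyed by computation. The worker state and the cost numbers below are illustrative stand-ins, not R*'s implementation.

```python
# Sketch of R*-style process retention: the first request a user computation
# makes to a remote site creates (and pays for) a worker there; later requests
# from the same computation reuse that worker and its retained context.

PROCESS_CREATION_COST = 50          # pretend milliseconds, illustrative only
REQUEST_COST = 5

class RemoteSite:
    def __init__(self, name):
        self.name = name
        self.workers = {}           # computation_id -> retained worker state

    def handle(self, computation_id, request):
        cost = REQUEST_COST
        if computation_id not in self.workers:
            # First request from this computation: create and retain a worker.
            self.workers[computation_id] = {"context": []}
            cost += PROCESS_CREATION_COST
        self.workers[computation_id]["context"].append(request)
        return cost

site_b = RemoteSite("B")
costs = [site_b.handle("user1-txn7", q) for q in ("parse", "plan", "execute")]
print(costs)          # [55, 5, 5]: creation cost paid once, then reused
print(len(site_b.workers["user1-txn7"]["context"]))   # 3 requests, one process
```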

Journal ArticleDOI
TL;DR: The types of undesirable events that occur in a database environment are classified and the necessary recovery information, with subsequent actions to recover the correct state of the database, is summarized.
Abstract: The need for robust recovery facilities in modern database management systems is quite well known. Various authors have addressed recovery facilities and specific techniques, but none have delved into the problem of recovery in database machines. In this paper, the types of undesirable events that occur in a database environment are classified and the necessary recovery information, with subsequent actions to recover the correct state of the database, is summarized. A model of the “processor-per-track” class of parallel associative database processor is presented. Three different types of recovery mechanisms that may be considered for parallel associative database processors are identified. For each architecture, both the workload imposed by the recovery mechanisms on the execution of database operations (i.e., retrieve, modify, delete, and insert) and the workload involved in the recovery actions (i.e., rollback, restart, restore, and reconstruct) are analyzed. The performance of the three architectures is quantitatively compared. This comparison is made in terms of the number of extra revolutions of the database area required to process a transaction versus the number of records affected by a transaction. A variety of different design parameters of the database processor, of the database, and of a mix of transaction types (modify, insert, and delete) are considered. A large number of combinations is selected and the effects of the parameters on the extra processing time are identified.

Proceedings ArticleDOI
16 May 1983
TL;DR: A comparative evaluation of the Goodyear massively parallel processor and an abstract conventional computer is carried out by examining specific database management functions rather than an entire database management system.
Abstract: The Goodyear massively parallel processor (MPP) represents a new architecture with the potential for providing improved solutions to applications benefiting from highly parallel operation. In this paper, application of the MPP to database management systems is examined. Specifically, the relational database model is considered. Database management has been selected as a candidate application of the MPP because of the positive results achieved in previous work related to parallel architectures and database systems. The relational model has been selected for its applicability to parallel processing, its mathematical foundation, and its general recognition as a model that is superior in many respects to the hierarchical and network models. The paper concentrates on a comparative evaluation of the MPP and an abstract conventional computer by examining specific database management functions rather than an entire database management system.