
Showing papers by "Philip A. Bernstein" published in 1981


Journal ArticleDOI
TL;DR: This paper surveys concurrency control methods for distributed databases. The authors decompose the problem into two major subproblems, read-write and write-write synchronization, and describe a series of synchronization techniques for solving each subproblem.
Abstract: In this paper we survey, consolidate, and present the state of the art in distributed database concurrency control. The heart of our analysis is a decomposition of the concurrency control problem into two major subproblems: read-write and write-write synchronization. We describe a series of synchronization techniques for solving each subproblem and show how to combine these techniques into algorithms for solving the entire concurrency control problem. Such algorithms are called "concurrency control methods." We describe 48 principal methods, including all practical algorithms that have appeared in the literature plus several new ones. We concentrate on the structure and correctness of concurrency control algorithms. Issues of performance are given only secondary treatment.
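
The read-write / write-write decomposition mentioned in the abstract can be illustrated with a minimal sketch. It assumes the usual notion that two operations from different transactions conflict when they touch the same data item and at least one is a write. The operation encoding, the function name `classify_conflicts`, and the sample operations are hypothetical and not taken from the paper.

```python
from itertools import combinations

# Each operation is (transaction, kind, item); kind is "r" (read) or "w" (write).
# This encoding is an assumption made for illustration only.

def classify_conflicts(ops):
    """Split conflicting operation pairs into read-write and write-write groups."""
    rw, ww = [], []
    for a, b in combinations(ops, 2):
        # Only operations of different transactions on the same item can conflict.
        if a[0] == b[0] or a[2] != b[2]:
            continue
        kinds = {a[1], b[1]}
        if kinds == {"r", "w"}:
            rw.append((a, b))   # handled by read-write synchronization
        elif kinds == {"w"}:
            ww.append((a, b))   # handled by write-write synchronization
        # Two reads never conflict, so they are ignored.
    return rw, ww

ops = [
    ("T1", "r", "x"), ("T2", "w", "x"),
    ("T1", "w", "y"), ("T2", "w", "y"),
    ("T2", "r", "z"), ("T1", "r", "z"),
]
rw, ww = classify_conflicts(ops)
```

Here `rw` holds the pair on item `x` and `ww` holds the pair on item `y`; the two reads of `z` appear in neither list.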

1,124 citations


Journal ArticleDOI
TL;DR: The semijoin operator is defined, why semijoin is an effective reduction operator is explained, and an algorithm is presented that constructs a cost-effective program of semijoins, given an envelope and a database.
Abstract: This paper describes the techniques used to optimize relational queries in the SDD-1 distributed database system. Queries are submitted to SDD-1 in a high-level procedural language called Datalanguage. Optimization begins by translating each Datalanguage query into a relational calculus form called an envelope, which is essentially an aggregate-free QUEL query. This paper is primarily concerned with the optimization of envelopes. Envelopes are processed in two phases. The first phase executes relational operations at various sites of the distributed database in order to delimit a subset of the database that contains all data relevant to the envelope. This subset is called a reduction of the database. The second phase transmits the reduction to one designated site, and the query is executed locally at that site. The critical optimization problem is to perform the reduction phase efficiently. Success depends on designing a good repertoire of operators to use during this phase, and an effective algorithm for deciding which of these operators to use in processing a given envelope against a given database. The principal reduction operator that we employ is called a semijoin. In this paper we define the semijoin operator, explain why semijoin is an effective reduction operator, and present an algorithm that constructs a cost-effective program of semijoins, given an envelope and a database.
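
The two-phase processing described in the abstract can be sketched roughly as follows. This is a simplified illustration that assumes a single equijoin between two relations held at different sites; it is not the SDD-1 optimizer. The relation contents and the helper names `semijoin` and `join` are hypothetical.

```python
def semijoin(r, s, attr):
    """Keep the tuples of r whose attr value appears in some tuple of s."""
    keys = {t[attr] for t in s}
    return [t for t in r if t[attr] in keys]

def join(r, s, attr):
    """Equijoin of r and s on attr (nested loops, for clarity only)."""
    return [{**a, **b} for a in r for b in s if a[attr] == b[attr]]

# Hypothetical data: orders live at site A, customers at site B.
orders = [
    {"order": 1, "cust": "c1"},
    {"order": 2, "cust": "c2"},
    {"order": 3, "cust": "c9"},
]
customers = [
    {"cust": "c1", "region": "east"},
    {"cust": "c2", "region": "west"},
    {"cust": "c5", "region": "east"},
]

# Phase 1 (reduction): semijoins delimit the data relevant to the query.
orders_red = semijoin(orders, customers, "cust")
customers_red = semijoin(customers, orders_red, "cust")

# Phase 2: the reduced relations are shipped to one site and joined locally.
answer = join(orders_red, customers_red, "cust")
```

In this sketch the reduced relations each hold two tuples instead of three, and joining them gives the same answer as joining the original relations.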

499 citations


Journal ArticleDOI
TL;DR: The exact class of relational queries that can be solved using semi-joins is shown and it is shown that queries outside of this class may not even be partially solvable using "short" semi-join programs.
Abstract: The semi-join is a relational algebraic operation that selects a set of tuples in one relation that match one or more tuples of another relation on the joining domains. Semi-joins have been used as a basic ingredient in query processing strategies for a number of hardware and software database systems. However, not all queries can be solved entirely using semi-joins. In this paper the exact class of relational queries that can be solved using semi-joins is shown. It is also shown that queries outside of this class may not even be partially solvable using "short" semi-join programs. In addition, a linear-time membership test for this class is presented.
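
The definition of the semi-join given in the abstract can be made concrete with a small sketch. Relations are modelled here as lists of dictionaries, which is an assumption made for illustration; the function name `semijoin` and the sample relations are hypothetical.

```python
def semijoin(r, s, attrs):
    """Return the tuples of r that match at least one tuple of s on attrs."""
    keys = {tuple(t[a] for a in attrs) for t in s}
    return [t for t in r if tuple(t[a] for a in attrs) in keys]

# Hypothetical example relations.
employees = [
    {"name": "Ada", "dept": 10},
    {"name": "Bob", "dept": 20},
    {"name": "Cy", "dept": 30},
]
departments = [
    {"dept": 10, "city": "Boston"},
    {"dept": 30, "city": "Austin"},
    {"dept": 30, "city": "Denver"},
]

# Employees whose department appears in departments.
reduced = semijoin(employees, departments, ["dept"])
```

Unlike a join, the result contains only attributes of the first relation, and each matching tuple appears once even when it matches several tuples of the second relation.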

468 citations


Journal ArticleDOI
TL;DR: This paper characterizes the queries for which full reducers exist, presents an efficient algorithm for constructing full reducers where they do exist, and considers the “natural” semijoin operator, which is used in the SDD-1 distributed database system.
Abstract: A semijoin is a relational operator that is used to reduce the cost of processing queries in the SDD-1 distributed database system, the RAP database machine, and similar systems. Semijoin is used in these systems as part of a query pre-processing phase; its function is to “reduce” the database by delimiting those portions of the database that contain data relevant to the query. For some queries, there exist sequences of semijoins that “fully reduce” the database; those sequences delimit the exact portions of the database needed to answer the query in the sense that if any less data were delimited then the query would produce a different answer. Such sequences are called full reducers. This paper characterizes the queries for which full reducers exist and presents an efficient algorithm for constructing full reducers where they do exist. This paper extends the results of Bernstein and Chiu [J. Assoc. Comput. Mach., 28 (1981), pp. 25–40] by considering a more powerful semijoin operator. We consider “natural” ...
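
The idea of a full reducer can be illustrated with a minimal sketch, assuming a simple three-relation chain query R(a,b), S(b,c), T(c,d), for which a sequence of four semijoins is enough. The relations, helper names, and the particular semijoin order are illustrative assumptions, not the construction algorithm from the paper.

```python
def semijoin(r, s, attr):
    """Keep the tuples of r whose attr value appears in some tuple of s."""
    keys = {t[attr] for t in s}
    return [t for t in r if t[attr] in keys]

def join(r, s, attr):
    """Equijoin of r and s on attr."""
    return [{**x, **y} for x in r for y in s if x[attr] == y[attr]]

def project(rel, attrs):
    """Project rel onto attrs, returning a set of value tuples."""
    return {tuple(t[a] for a in attrs) for t in rel}

# Hypothetical chain query: R joins S on b, S joins T on c.
R = [{"a": 1, "b": 1}, {"a": 2, "b": 2}, {"a": 3, "b": 3}]
S = [{"b": 1, "c": 1}, {"b": 2, "c": 5}, {"b": 4, "c": 1}]
T = [{"c": 1, "d": 7}, {"c": 6, "d": 8}]

# Semijoin program: one pass toward R, then one pass back toward T.
S1 = semijoin(S, T, "c")
R1 = semijoin(R, S1, "b")
S2 = semijoin(S1, R1, "b")
T1 = semijoin(T, S2, "c")

# Reference answer computed on the unreduced relations.
result = join(join(R, S, "b"), T, "c")
```

After the program runs, each reduced relation contains exactly the tuples that take part in `result`, which is the sense in which the database is fully reduced for this query.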

240 citations


Proceedings ArticleDOI
04 May 1981
TL;DR: The basic architecture of Multibase is described and some of the avenues to be taken in subsequent research are identified, including developing appropriate language constructs for accessing and integrating heterogeneous databases.
Abstract: Multibase is a software system for integrating access to preexisting, heterogeneous, distributed databases. The system suppresses differences of DBMS, language, and data models among the databases and provides users with a unified global schema and a single high-level query language. Autonomy for updating is retained with the local databases. The architecture of Multibase does not require any changes to local databases or DBMSs. There are three principal research goals of the project. The first goal is to develop appropriate language constructs for accessing and integrating heterogeneous databases. The second goal is to discover effective global and local optimization techniques. The final goal is to design methods for handling incompatible data representations and inconsistent data. Currently the project is in the first year of a planned three-year effort. This paper describes the basic architecture of Multibase and identifies some of the avenues to be taken in subsequent research.

215 citations


Journal ArticleDOI
TL;DR: The central notion of the paper is that of a general purpose scheduler, a database system scheduler that is blind to the semantics of transactions and integrity assertions, which establishes a tight connection between database system correctness and scheduler behavior.
Abstract: A family of simple models for database systems is defined, where a system is composed of a scheduler, a data manager and several user transactions. The basic correctness criterion for such systems is taken to be consistency preservation. The central notion of the paper is that of a general purpose scheduler, a database system scheduler that is blind to the semantics of transactions and integrity assertions. Consistency preservation of a database system is shown to be precisely equivalent to a restriction on the output of a general purpose scheduler GPS, called weak serializability. That is, any database system using GPS will preserve consistency iff the output of GPS is always weakly serializable. This establishes a tight connection between database system correctness and scheduler behavior. Also, aspects of restart facilities and predeclared data accesses are discussed. Finally, several examples of schedulers correct with respect to weak serializability are presented.

31 citations



Journal ArticleDOI
TL;DR: This paper considers a class of queries called natural inequality queries (NI queries), and characterizes a subclass for which full reducers exist, and presents an efficient algorithm that decides whether an NI query lies within this subclass, and constructs a full reducer for the query.

15 citations


Journal ArticleDOI
01 Jan 1981
TL;DR: This paper disproves several results pertaining to database concurrency control and demonstrates that the notion of "weak consistency" introduced in [8] admits database states that are strictly inconsistent.
Abstract: This paper disproves several results pertaining to database concurrency control that are claimed in [8]. The results we disprove are
• theorems 3.1, 3.2, 3.6 -- which claim a polynomial time algorithm for testing whether transaction schedules are serializable, and
• theorems 4.2 and 4.7 -- which claim a necessary and sufficient mechanism for preserving the "weak consistency" of databases.
In addition, we demonstrate that the notion of "weak consistency" introduced in [8] admits database states that are strictly inconsistent.

3 citations