
Showing papers in "ACM Transactions on Database Systems in 1992"


Journal ArticleDOI
Chandrasekaran Mohan, Don Haderle, Bruce G. Lindsay, Hamid Pirahesh, Peter Schwarz
TL;DR: ARIES is a transaction recovery method based on write-ahead logging; it is applicable not only to database management systems but also to persistent object-oriented languages, recoverable file systems, and transaction-based operating systems.
Abstract: … DB2, IMS, and Tandem systems. ARIES is applicable not only to database management systems but also to persistent object-oriented languages, recoverable file systems, and transaction-based operating systems. ARIES has been implemented, to varying degrees, in IBM's OS/2 Extended Edition Database Manager, DB2, Workstation Data Save Facility/VM, Starburst and QuickSilver, and in the University of Wisconsin's EXODUS and Gamma database machine.
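
To make the logging idea concrete, here is a minimal Python sketch of an ARIES-style redo pass that "repeats history" by comparing log sequence numbers (LSNs) with page LSNs. The data structures and names are simplifying assumptions for illustration, not the paper's or any product's implementation; the analysis pass, undo, and checkpointing are omitted.

```python
# Minimal sketch of an ARIES-style redo pass ("repeating history").
# All names and structures are simplified assumptions, not the paper's code.
from dataclasses import dataclass

@dataclass
class LogRecord:
    lsn: int          # log sequence number
    page_id: str      # page the update applies to
    redo: callable    # idempotent redo action for this update

class Page:
    def __init__(self, page_id):
        self.page_id = page_id
        self.page_lsn = 0          # LSN of the last update applied to this page
        self.data = {}

def redo_pass(log, buffer_pool):
    """Reapply every logged update whose effect is not yet on the page."""
    for rec in sorted(log, key=lambda r: r.lsn):
        page = buffer_pool[rec.page_id]
        if page.page_lsn < rec.lsn:   # update missing from the page: redo it
            rec.redo(page)
            page.page_lsn = rec.lsn   # remember the last update applied here

# Tiny usage example
pool = {"P1": Page("P1")}
log = [LogRecord(1, "P1", lambda p: p.data.update(x=1)),
       LogRecord(2, "P1", lambda p: p.data.update(x=2))]
redo_pass(log, pool)
print(pool["P1"].data, pool["P1"].page_lsn)   # {'x': 2} 2
```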

1,083 citations


Journal ArticleDOI
TL;DR: This thesis develops a new family of algorithms for scheduling real-time transactions and proposes new techniques for handling requests with and without deadlines simultaneously, finding that real-time disk scheduling algorithms can perform better than conventional algorithms.
Abstract: This thesis has six chapters. Chapter 1 motivates the thesis by describing the characteristics of real-time database systems and the problems of scheduling transactions with deadlines. We also present a short survey of related work and discuss how this thesis has contributed to the state of the art. In Chapter 2 we develop a new family of algorithms for scheduling real-time transactions. Our algorithms have four components: a policy to manage overloads, a policy for scheduling the CPU, a policy for scheduling access to data (i.e., concurrency control), and a policy for scheduling I/O requests on a disk device. In Chapter 3, our scheduling algorithms are evaluated via simulation. Our chief result is that real-time scheduling algorithms can perform significantly better than a conventional non-real-time algorithm. In particular, the Least Slack (static evaluation) policy for scheduling the CPU, combined with the Wait Promote policy for concurrency control, produces the best overall performance. In Chapter 4 we develop a new set of algorithms for scheduling disk I/O requests with deadlines. Our model assumes the existence of a real-time database system which assigns deadlines to individual read and write requests. We also propose new techniques for handling requests without deadlines and requests with deadlines simultaneously. This approach greatly improves the performance of the algorithms and their ability to minimize missed deadlines. In Chapter 5 we evaluate the I/O scheduling algorithms using detailed simulation. Our chief result is that real-time disk scheduling algorithms can perform better than conventional algorithms. In particular, our algorithm FD-SCAN was found to be very effective across a wide range of experiments. Finally, in Chapter 6 we summarize our conclusions and discuss how this work has contributed to the state of the art. Also, we briefly explore some interesting new directions for continuing this research.
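
As a rough illustration of the Least Slack CPU scheduling policy mentioned above, the following sketch picks the ready transaction whose slack (time to deadline minus remaining execution time) is smallest. The transaction model and field names are simplified assumptions, not the thesis's simulator.

```python
# Minimal sketch of Least Slack (static) CPU scheduling for transactions.
# Field names and the model are simplifying assumptions.
from dataclasses import dataclass

@dataclass
class Txn:
    name: str
    deadline: float        # absolute deadline
    remaining_exec: float  # estimated execution time still needed

def slack(txn, now):
    """Slack = time to deadline minus remaining work; smaller = more urgent."""
    return (txn.deadline - now) - txn.remaining_exec

def pick_next(ready, now):
    """Run the ready transaction with the least slack."""
    return min(ready, key=lambda t: slack(t, now)) if ready else None

ready = [Txn("T1", deadline=10.0, remaining_exec=3.0),
         Txn("T2", deadline=6.0, remaining_exec=4.0)]
print(pick_next(ready, now=0.0).name)   # T2 (slack 2.0 < 7.0)
```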

575 citations


Journal ArticleDOI
TL;DR: A property called recoverability is used to decrease the delay involved in processing noncommuting operations while still avoiding cascading aborts; to ensure the serializability of transactions, the recoverability relationship between transactions is forced to be acyclic.
Abstract: The concurrency of transactions executing on atomic data types can be enhanced through the use of semantic information about operations defined on these types. Hitherto, commutativity of operations has been exploited to provide enhanced concurrency while avoiding cascading aborts. We have identified a property known as recoverability which can be used to decrease the delay involved in processing noncommuting operations while still avoiding cascading aborts. When an invoked operation is recoverable with respect to an uncommitted operation, the invoked operation can be executed by forcing a commit dependency between the invoked operation and the uncommitted operation; the transaction invoking the operation will not have to wait for the uncommitted operation to abort or commit. Further, this commit dependency only affects the order in which the operations should commit, if both commit; if either operation aborts, the other can still commit, thus avoiding cascading aborts. To ensure the serializability of transactions, we force the recoverability relationship between transactions to be acyclic. Simulation studies, based on the model presented by Agrawal et al. [1], indicate that using recoverability, the turnaround time of transactions can be reduced. Further, our studies show enhancement in concurrency even when resource constraints are taken into consideration. The magnitude of enhancement is dependent on the resource contention; the lower the resource contention, the higher the improvement.
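
A minimal sketch of the commit-dependency bookkeeping described above, with invented data structures: a transaction that executes a recoverable operation on top of an uncommitted one records a dependency that constrains only commit order, so an abort of the earlier transaction does not cascade.

```python
# Sketch of commit dependencies for recoverable operations (assumed structures).
class CommitDependencyManager:
    def __init__(self):
        self.deps = {}        # txn -> set of txns it must not commit before
        self.finished = {}    # txn -> "committed" | "aborted"

    def add_dependency(self, later, earlier):
        """later may execute now, but must commit only after earlier finishes."""
        self.deps.setdefault(later, set()).add(earlier)

    def can_commit(self, txn):
        # txn may commit once every transaction it depends on has finished
        # (committed OR aborted -- an abort does not cascade to txn).
        return all(d in self.finished for d in self.deps.get(txn, set()))

    def finish(self, txn, outcome):
        self.finished[txn] = outcome

m = CommitDependencyManager()
m.add_dependency("T2", "T1")     # T2 ran a recoverable op on top of T1's update
print(m.can_commit("T2"))        # False: T1 still active
m.finish("T1", "aborted")        # T1 aborts...
print(m.can_commit("T2"))        # True: T2 can still commit (no cascading abort)
```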

212 citations


Journal ArticleDOI
TL;DR: This paper provides a careful specification of the OO1 benchmark, shows how it can be implemented on database systems, and presents evidence that more than an order of magnitude difference in performance can result from a DBMS implementation quite different from current products.
Abstract: Performance is a major issue in the acceptance of object-oriented and relational database systems aimed at engineering applications such as computer-aided software engineering (CASE) and computer-aided design (CAD). Because traditional database system benchmarks are inappropriate for measuring performance on operations over engineering objects, we designed a new benchmark, Object Operations version 1 (OO1), to focus on important characteristics of these applications. OO1 is descended from an earlier benchmark for simple database operations and is based on several years' experience with that benchmark. In this paper we describe the OO1 benchmark and the results we obtained running it on a variety of database systems. We provide a careful specification of the benchmark, show how it can be implemented on database systems, and present evidence that more than an order of magnitude difference in performance can result from a DBMS implementation quite different from current products: minimizing overhead per database call, offloading database server functionality to workstations, taking advantage of large main memories, and using link-based methods.

181 citations


Journal ArticleDOI
TL;DR: The effectiveness of taxonomic reasoning techniques as an active support to knowledge acquisition and conceptual schema design is shown and an extended formalism and taxonomic inference algorithms for models giving prominence to attributes are given.
Abstract: Taxonomic reasoning is a typical task performed by many AI knowledge representation systems. In this paper, the effectiveness of taxonomic reasoning techniques as an active support to knowledge acquisition and conceptual schema design is shown. The idea developed is that by extending conceptual models with defined concepts and giving them rigorous logic semantics, it is possible to infer isa relationships between concepts on the basis of their descriptions. From a theoretical point of view, this approach makes it possible to give a formal definition for consistency and minimality of a conceptual schema. From a pragmatic point of view, it is possible to develop an active environment that allows automatic classification of a new concept in the right position of a given taxonomy, ensuring the consistency and minimality of a conceptual schema. A formalism that includes the data semantics of models giving prominence to type constructors (E/R, TAXIS, GALILEO) and algorithms for taxonomic inferences are presented; their soundness, completeness, and tractability properties are proved. Finally, an extended formalism and taxonomic inference algorithms for models giving prominence to attributes (FDM, IFO) are given.
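
As a hypothetical illustration of classification by description (far simpler than the paper's formalism), the sketch below models a concept as a set of attribute constraints and infers isa links by set containment.

```python
# Hypothetical sketch of inferring isa links by comparing concept descriptions.
# Concepts are modeled as sets of (attribute, type) constraints -- a drastic
# simplification of the paper's formalism.
def subsumes(general, specific):
    """general subsumes specific if every constraint of general also holds in specific."""
    return general.issubset(specific)

def classify(new_concept, taxonomy):
    """Return the concepts the new one should be placed under (inferred isa)."""
    return [name for name, desc in taxonomy.items() if subsumes(desc, new_concept)]

taxonomy = {
    "Person":   {("name", "string")},
    "Employee": {("name", "string"), ("salary", "int")},
}
manager = {("name", "string"), ("salary", "int"), ("dept", "string")}
print(classify(manager, taxonomy))   # ['Person', 'Employee']
```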

130 citations


Journal ArticleDOI
TL;DR: This work splits the translation of EER schemas into relational schemas into four stages, each corresponding to one aspect of the translation, and defines criteria for both evaluating the correctness of and characterizing the relationship between alternative relational representations of EER schemas.
Abstract: A common approach to database design is to describe the structures and constraints of the database application in terms of a semantic data model, and then represent the resulting schema using the data model of a commercial database management system. Often, in practice, Extended Entity-Relationship (EER) schemas are translated into equivalent relational schemas. This translation involves different aspects: representing the EER schema using relational constructs, assigning names to relational attributes, normalization, and merging relations. Considering these aspects together, as is usually done in the design methodologies proposed in the literature, is confusing and leads to inaccurate results. We propose to treat these aspects separately and split the translation into four stages (modules), one for each of the aspects mentioned above. We define criteria for both evaluating the correctness of and characterizing the relationship between alternative relational representations of EER schemas.
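
For illustration only, here is a minimal sketch of the first stage (representing EER constructs with relational ones) for entity types and binary relationships; attribute naming, normalization, and merging, which the paper treats as separate stages, are not shown, and the encoding is an assumption.

```python
# Hypothetical sketch of stage 1: map EER entity types and binary relationships
# to relation schemas (attribute name lists). Simplified; keys are assumed given.
def entity_to_relation(name, key_attrs, other_attrs):
    return {name: list(key_attrs) + list(other_attrs)}

def relationship_to_relation(name, entity_keys, attrs=()):
    """entity_keys: {entity_name: [its key attributes]} for the participants."""
    cols = [f"{e}_{k}" for e, keys in entity_keys.items() for k in keys]
    return {name: cols + list(attrs)}

schema = {}
schema.update(entity_to_relation("Employee", ["emp_id"], ["name"]))
schema.update(entity_to_relation("Dept", ["dept_id"], ["budget"]))
schema.update(relationship_to_relation("WorksIn",
              {"Employee": ["emp_id"], "Dept": ["dept_id"]}, ["since"]))
print(schema)
```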

119 citations


Journal ArticleDOI
TL;DR: A logical tree structure is imposed on the set of copies of an object and a protocol that uses the information available in the logical structure to reduce the communication requirements for read and write operations is developed.
Abstract: In this paper, we present a low-cost fault-tolerant protocol for managing replicated data. We impose a logical tree structure on the set of copies of an object and develop a protocol that uses the information available in the logical structure to reduce the communication requirements for read and write operations. The tree quorum protocol is a generalization of the static voting protocol with two degrees of freedom for choosing quorums. In general, this results in significantly lower communication costs for comparable data availability. The protocol exhibits the property of graceful degradation, i.e., communication costs for executing operations are minimal in a failure-free environment but may increase as failures occur. This approach to designing distributed systems is desirable since it provides fault tolerance without imposing unnecessary costs on the failure-free mode of operations.
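
A minimal sketch of forming a read quorum under the tree quorum idea, assuming a simple tree of copies: the root copy suffices when it is available, and an unavailable node is replaced, recursively, by a majority of its children. Write quorums and the protocol's tunable degrees of freedom are not modeled.

```python
# Minimal sketch of forming a read quorum with the tree quorum idea:
# use the root copy if it is available; otherwise replace it, recursively,
# by a majority of its children. Structures and parameters are assumptions.
class Node:
    def __init__(self, copy_id, children=(), up=True):
        self.copy_id, self.children, self.up = copy_id, list(children), up

def read_quorum(node):
    """Return a list of copies forming a quorum, or None if none can be formed."""
    if node.up:
        return [node.copy_id]
    if not node.children:
        return None
    majority = len(node.children) // 2 + 1
    quorum, got = [], 0
    for child in node.children:
        q = read_quorum(child)
        if q is not None:
            quorum += q
            got += 1
            if got == majority:
                return quorum
    return None

leaves = [Node(f"c{i}") for i in range(3)]
root = Node("root", children=leaves, up=False)   # root copy has failed
print(read_quorum(root))   # ['c0', 'c1']: a majority of children replaces the root
```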

114 citations


Journal ArticleDOI
TL;DR: Proves the richer expressiveness of a more general form of functional dependency for semantic data models, one that derives from their common feature of combining the relational model's separate notions of domain and relation into a single notion of class.
Abstract: We propose a more general form of functional dependency for semantic data models that derives from their common feature in which the separate notions of domain and relation in the relational model are combined into a single notion of class. This usually results in a richer terminological component for their query languages, whereby terms may navigate through any number of properties, including none. We prove the richer expressiveness of this more general functional dependency, and exhibit a sound and complete set of inference axioms. Although the general problem of decidability of their logical implication remains open at this time, we present decision procedures for cases in which the dependencies included in a schema correspond to keys, or in which the schema itself is acyclic. The theory is then extended to include a form of conjunctive query. Of particular significance is that the query becomes an additional source of functional dependency. Finally, we outline several applications of the theory to various problems in physical design and in query optimization. The applications derive from an ability to predict when a query can have at most one solution.

104 citations


Journal ArticleDOI
TL;DR: It is shown that the flat relational algebra is rich enough to extract the same “flat information” from a flat database as the nested algebra does, which implies that recursive queries such as the transitive closure of a binary relation cannot be expressed in the nested algebra.
Abstract: Nested relations generalize ordinary flat relations by allowing tuple values to be either atomic or set valued. The nested algebra is a generalization of the flat relational algebra to manipulate nested relations. In this paper we study the expressive power of the nested algebra relative to its operation on flat relational databases. We show that the flat relational algebra is rich enough to extract the same “flat information” from a flat database as the nested algebra does. Theoretically, this result implies that recursive queries such as the transitive closure of a binary relation cannot be expressed in the nested algebra. Practically, this result is relevant to (flat) relational query optimization.

96 citations


Journal ArticleDOI
TL;DR: A number of concurrency control concepts and transaction scheduling techniques that are applicable to high contention environments, and that do not rely on database semantics to reduce contention are considered.
Abstract: Future transaction processing systems may have substantially higher levels of concurrency due to reasons which include: (1) increasing disparity between processor speeds and data access latencies, (2) large numbers of processors, and (3) distributed databases. Another influence is the trend towards longer or more complex transactions. A possible consequence is substantially more data contention, which could limit total achievable throughput. In particular, it is known that the usual locking method of concurrency control is not well suited to environments where data contention is a significant factor. Here we consider a number of concurrency control concepts and transaction scheduling techniques that are applicable to high contention environments, and that do not rely on database semantics to reduce contention. These include access invariance and its application to prefetching of data, approximations to essential blocking such as wait depth limited scheduling, and phase dependent control. The performance of various concurrency control methods based on these concepts is studied using detailed simulation models. The results indicate that the new techniques can offer substantial benefits for systems with high levels of data contention.
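
The wait depth limited idea can be sketched as an admission check on lock waits (a simplified illustration with an assumed victim policy of restarting the requester; the paper's wait depth limited schedulers choose victims more carefully):

```python
# Sketch of a wait-depth-limited admission check (simplified assumption: when the
# limit would be exceeded we restart the requester; the paper's policies
# choose the victim more carefully).
def wait_depth(txn, waits_for):
    """Length of the chain txn -> holder -> ... that the requester would extend."""
    depth = 0
    while txn in waits_for:
        txn = waits_for[txn]
        depth += 1
    return depth

def request_wait(requester, holder, waits_for, limit=1):
    """Return True if requester may block on holder without exceeding the limit."""
    if wait_depth(holder, waits_for) + 1 > limit:
        return False            # would create a wait chain deeper than `limit`
    waits_for[requester] = holder
    return True

waits_for = {}
print(request_wait("T2", "T1", waits_for))   # True: T1 is running, depth 1
print(request_wait("T3", "T2", waits_for))   # False: T2 is already waiting
```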

94 citations


Journal ArticleDOI
TL;DR: This paper describes the architecture of a system having two interrelated components: a combined conventional/semantic query optimizer, and an automatic rule deriver, and shows how semantic query optimization is an extension of conventional optimization in this context.
Abstract: The use of inference rules to support intelligent data processing is an increasingly important tool in many areas of computer science. In database systems, rules are used in semantic query optimization as a method for reducing query processing costs. The savings depend on the ability of experts to supply a set of useful rules and the ability of the optimizer to quickly find the appropriate transformations generated by these rules. Unfortunately, the most useful rules are not always those that would or could be specified by an expert. This paper describes the architecture of a system having two interrelated components: a combined conventional/semantic query optimizer, and an automatic rule deriver. Our automatic rule derivation method uses intermediate results from the optimization process to direct the search for learning new rules. Unlike a system employing only user-specified rules, a system with an automatic capability can derive rules that may be true only in the current state of the database and can modify the rule set to reflect changes in the database and its usage pattern. This system has been implemented as an extension of the EXODUS conventional query optimizer generator. We describe the implementation, and show how semantic query optimization is an extension of conventional optimization in this context.
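
As a hypothetical example of the kind of semantic transformation such rules enable (the rule and predicates below are invented), a rule whose antecedent is implied by the query lets the optimizer attach the rule's consequent as an extra, possibly index-supported, predicate:

```python
# Hypothetical sketch of a semantic query rewrite: if the query's predicates imply
# a rule's antecedent, the rule's consequent can be added as an extra predicate
# (e.g., one that an index can exploit). Rule content is invented for illustration.
def apply_semantic_rules(query_preds, rules):
    preds = set(query_preds)
    changed = True
    while changed:
        changed = False
        for antecedent, consequent in rules:
            if antecedent in preds and consequent not in preds:
                preds.add(consequent)        # implied by an integrity rule
                changed = True
    return preds

rules = [("type = 'barge'", "weight > 100")]          # invented example rule
query = ["type = 'barge'", "origin = 'Oslo'"]
print(apply_semantic_rules(query, rules))
# {"type = 'barge'", "origin = 'Oslo'", "weight > 100"}
```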

Journal ArticleDOI
TL;DR: This paper shows in particular how the special processing techniques of a geometric database system, such as spatial join methods and geometric index structures, can be integrated into query processing and optimization of a relational database system.
Abstract: Gral is an extensible database system, based on the formal concept of a many-sorted relational algebra. Many-sorted algebra is used to define any application's query language, its query execution language, and its optimization rules. In this paper we describe Gral's optimization component. It provides (1) a sophisticated rule language—rules are transformations of abstract algebra expressions, (2) a general optimization framework under which more specific optimization algorithms can be implemented, and (3) several control mechanisms for the application of rules. An optimization algorithm can be specified as a series of steps. Each step is defined by its own collection of rules together with a selected control strategy. The general facilities are illustrated by the complete design of an example optimizer—in the form of a rule file—for a small nonstandard query language and an associated execution language. The query language includes selection, join, ordering, embedding derived values, aggregate functions, and several geometric operations. The example shows in particular how the special processing techniques of a geometric database system, such as spatial join methods and geometric index structures, can be integrated into query processing and optimization of a relational database system. A similar, though larger, optimizer is fully functional within the geometric database system implemented as a Gral prototype.
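
The rule-plus-control-strategy idea can be illustrated with a small sketch that is not Gral's rule language: one optimization step applies its rule collection to an algebra expression until no rule fires (one possible control strategy). The expression encoding and the single rule are assumptions.

```python
# Hypothetical sketch of an optimizer step: apply a collection of rewrite rules to
# an algebra expression tree until a fixpoint (one possible control strategy).
# The expression encoding and the single rule shown are illustrative assumptions.
def push_select_below_join(expr):
    # ('select', p, ('join', L, R)) -> ('join', ('select', p, L), R)
    # assuming (for the sketch) that predicate p refers only to the left input.
    if expr[0] == "select" and expr[2][0] == "join":
        p, (_, left, right) = expr[1], expr[2]
        return ("join", ("select", p, left), right)
    return None   # rule does not apply

def run_step(expr, rules):
    changed = True
    while changed:
        changed = False
        for rule in rules:
            new = rule(expr)
            if new is not None:
                expr, changed = new, True
    return expr

expr = ("select", "cities.pop > 1e6", ("join", "cities", "rivers"))
print(run_step(expr, [push_select_below_join]))
# ('join', ('select', 'cities.pop > 1e6', 'cities'), 'rivers')
```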

Journal ArticleDOI
TL;DR: TSOS is presented, a system for reasoning about time that can be integrated as a time expert in environments designed for broader problem-solving domains and has the capability to reason about temporal data specified at different time granularities.
Abstract: In many computer-based applications, temporal information has to be stored, retrieved, and related to other temporal information. Several time models have been proposed to manage temporal knowledge in the fields of conceptual modeling, database systems, and artificial intelligence. In this paper we present TSOS, a system for reasoning about time that can be integrated as a time expert in environments designed for broader problem-solving domains. The main intended goal of TSOS is to allow a user to infer further information on the temporal data stored in the database through a set of deduction rules handling various aspects of time. For this purpose, TSOS provides the capability of answering queries about the temporal specifications it has in its temporal database. Distinctive time-modeling features of TSOS are: the introduction of temporal modalities, i.e., the possibility of specifying whether a piece of information is always true within a time interval or only sometimes true, together with the capability of answering about the possibility and the necessity of the validity of some information at a given time; the association of temporal knowledge both with instances of data and with types of data; and the development of a time calculus for reasoning on temporal data. Another relevant feature of TSOS is the capability to reason about temporal data specified at different time granularities.
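
A minimal sketch of the "always" versus "sometimes" modalities (an assumed encoding, not TSOS's calculus): a fact asserted with the always modality over an interval is necessarily valid at any instant inside it, while a sometimes assertion only makes it possibly valid.

```python
# Minimal sketch of "always"/"sometimes" temporal modalities (assumed encoding,
# not TSOS's calculus). A fact is asserted over an interval with a modality.
facts = [
    # (fact, start, end, modality)
    ("printer_busy", 9, 12, "sometimes"),
    ("office_open",  8, 17, "always"),
]

def necessarily(fact, t):
    """True if some assertion guarantees the fact at time t (always modality)."""
    return any(f == fact and s <= t <= e and m == "always"
               for f, s, e, m in facts)

def possibly(fact, t):
    """True if some assertion allows the fact at time t (any modality)."""
    return any(f == fact and s <= t <= e for f, s, e, m in facts)

print(necessarily("office_open", 10))   # True
print(necessarily("printer_busy", 10))  # False: only sometimes true in [9, 12]
print(possibly("printer_busy", 10))     # True
```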

Journal ArticleDOI
C. J. Date, Ronald Fagin
TL;DR: It is shown that if a relation schema is in third normal form and every key is simple, then it is in projection-join normal form (sometimes called fifth normal form), the ultimate normal form with respect to projections and joins.
Abstract: A key is simple if it consists of a single attribute. It is shown that if a relation schema is in third normal form and every key is simple, then it is in projection-join normal form (sometimes called fifth normal form), the ultimate normal form with respect to projections and joins. Furthermore, it is shown that if a relation schema is in Boyce-Codd normal form and some key is simple, then it is in fourth normal form (but not necessarily projection-join normal form). These results give the database designer simple sufficient conditions, defined in terms of functional dependencies alone, that guarantee that the schema being designed is automatically in higher normal forms.
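
Restating the two sufficient conditions compactly in the abstract's own terms:

```latex
% 3NF with only simple keys implies projection-join normal form (PJ/NF, "fifth normal form"):
R \in \mathrm{3NF} \;\wedge\; \text{every key of } R \text{ is simple}
  \;\Longrightarrow\; R \in \mathrm{PJ/NF}

% BCNF with at least one simple key implies fourth normal form (but not necessarily PJ/NF):
R \in \mathrm{BCNF} \;\wedge\; \text{some key of } R \text{ is simple}
  \;\Longrightarrow\; R \in \mathrm{4NF}
```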

Journal ArticleDOI
TL;DR: The analytical tools developed enable us to see that the cautious waiting algorithm manages to achieve a delicate balance between restart and blocking, and therefore is superior (i.e., has higher throughput) to both the no-waiting and general waiting algorithms under a wide range of system parameters.
Abstract: We study a deadlock-free locking-based concurrency control algorithm, called cautious waiting, which allows for a limited form of waiting. The algorithm is very simple to implement. We present an analytical solution to its performance evaluation based on the mean-value approach proposed by Tay et al. [18]. From the modeling point of view, we are able to do away with a major assumption used in Tay's previous work, and therefore capture more accurately both the restart and the blocking rates in the system. We show that to solve for this model we only need to solve for the root of a polynomial. The analytical tools developed enable us to see that the cautious waiting algorithm manages to achieve a delicate balance between restart and blocking, and therefore is superior (i.e., has higher throughput) to both the no-waiting (i.e., immediate restart) and the general waiting algorithms under a wide range of system parameters. The study substantiates the argument that balancing restart and blocking is important in locking systems.
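
The cautious waiting rule itself fits in a few lines (a simplified illustration, not the paper's analytical model): a requester may wait only for a lock holder that is not itself blocked; otherwise the requester restarts, so waits-for chains never exceed length one and no deadlock cycle can form.

```python
# Sketch of the cautious waiting rule (simplified illustration).
# A lock requester may block only on a holder that is not itself blocked;
# otherwise the requester is restarted. Chains of waiters therefore have
# length at most one, so no deadlock cycle can form.
def on_conflict(requester, holder, blocked):
    """blocked: set of transactions currently waiting for a lock."""
    if holder in blocked:
        return "restart requester"       # cautious: never wait on a waiter
    blocked.add(requester)
    return "requester waits"

blocked = set()
print(on_conflict("T2", "T1", blocked))  # 'requester waits'   (T1 is active)
print(on_conflict("T3", "T2", blocked))  # 'restart requester' (T2 is blocked)
```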

Journal ArticleDOI
TL;DR: A new way of generating signatures, the fixed-weight block (FWB) method, is introduced that has a lower false-drop probability than that of the FSB method, but its storage overhead is slightly higher.
Abstract: Previous work on superimposed coding has been characterized by two aspects. First, it is generally assumed that signatures are generated from logical text blocks of the same size; that is, each block contains the same number of unique terms after stopword and duplicate removal. We call this approach the fixed-size block (FSB) method, since each text block has the same size, as measured by the number of unique terms contained in it. Second, with only a few exceptions [6,7,8,9,17], most previous work has assumed that each term in the text contributes the same number of ones to the signature (i.e., the weight of the term signatures is fixed). The main objective of this paper is to derive an optimal weight assignment that assigns weights to document terms according to their occurrence and query frequencies in order to minimize the false-drop probability. The optimal scheme can account for both uniform and nonuniform occurrence and query frequencies, and the signature generation method is still based on hashing rather than on table lookup. Furthermore, a new way of generating signatures, the fixed-weight block (FWB) method, is introduced. FWB controls the weight of every signature to a constant, whereas in FSB, only the expected signature weight is constant. We have shown that FWB has a lower false-drop probability than that of the FSB method, but its storage overhead is slightly higher. Other advantages of FWB are that the optimal weight assignment can be obtained analytically without making unrealistic assumptions and that the formula for computing the term signature weights is simple and efficient.
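
For readers unfamiliar with superimposed coding, here is a hypothetical sketch of hash-based signature generation and the filtering test in the basic fixed-weight-per-term setting; the parameters are arbitrary, and neither FWB's constant block weight nor the paper's optimal weight assignment is reproduced.

```python
# Hypothetical sketch of superimposed coding: each term hashes to `weight` bit
# positions in an F-bit signature; a block may contain a query term only if all
# of the term's bits are set in the block signature (matches can still be
# false drops). Parameters are arbitrary; the paper derives optimal weights.
import hashlib

F = 64  # signature length in bits

def term_signature(term, weight=3):
    sig = 0
    for i in range(weight):
        h = hashlib.md5(f"{term}:{i}".encode()).hexdigest()
        sig |= 1 << (int(h, 16) % F)   # set (up to) `weight` bits for this term
    return sig

def block_signature(terms, weight=3):
    sig = 0
    for t in terms:
        sig |= term_signature(t, weight)   # superimpose the term signatures
    return sig

block = block_signature(["database", "signature", "coding"])
query = term_signature("database")
print(query & block == query)   # True: the block may contain the term
print(term_signature("zebra") & block == term_signature("zebra"))  # likely False
```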

Journal ArticleDOI
TL;DR: An efficient heuristic to select a near optimal set of page-answers and page-traces to populate the main memory has been developed, implemented, and tested and quantitative measurements of performance benefits are reported.
Abstract: In this paper a new method to improve the utilization of main memory systems is presented. The new method is based on prestoring in main memory a number of query answers, each evaluated out of a single memory page. To this end, the ideas of page-answers and page-traces are formally described and their properties analyzed. The query model used here allows for selection, projection, join, recursive queries as well as arbitrary combinations. We also show how to apply the approach under update traffic. This concept is especially useful in managing the main memories of an important class of applications. This class includes the evaluation of triggers and alerters, performance improvement of rule-based systems, integrity constraint checking, and materialized views. These applications are characterized by the existence at compile time of a predetermined set of queries, by a slow but persistent update traffic, and by their need to repetitively reevaluate the query set. The new approach represents a new type of intelligent database caching, which contrasts with traditional caching primarily in that the cache elements are derived data and as a consequence, they overlap arbitrarily and do not have a fixed length. The contents of the main memory cache are selected based on the data distribution within the database, the set of fixed queries to preprocess, and the paging characteristics. Page-answers and page-traces are used as the smallest indivisible units in the cache. An efficient heuristic to select a near optimal set of page-answers and page-traces to populate the main memory has been developed, implemented, and tested. Finally, quantitative measurements of performance benefits are reported.
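
A minimal sketch of the page-answer idea with invented structures (not the paper's algorithms): each cache entry is the answer of one fixed query evaluated against one page, so an update to a page invalidates and refreshes only the entries derived from that page.

```python
# Sketch of caching page-answers: the answer to each fixed query, evaluated
# against a single page, is cached per (query, page) pair; updating a page
# invalidates only the entries derived from it. Structures are invented.
class PageAnswerCache:
    def __init__(self, queries, pages):
        self.queries = queries            # name -> predicate over a tuple
        self.pages = pages                # page_id -> list of tuples
        self.cache = {}                   # (query, page_id) -> answer
        for q in queries:
            for p in pages:
                self._refresh(q, p)

    def _refresh(self, q, p):
        self.cache[(q, p)] = [t for t in self.pages[p] if self.queries[q](t)]

    def update_page(self, p, tuples):
        self.pages[p] = tuples
        for q in self.queries:            # only entries for page p are recomputed
            self._refresh(q, p)

    def answer(self, q):                  # union of per-page answers
        return [t for (qq, p), ans in self.cache.items() if qq == q for t in ans]

c = PageAnswerCache({"hot": lambda t: t[1] > 100},
                    {"P1": [("a", 50), ("b", 150)], "P2": [("c", 200)]})
print(c.answer("hot"))                   # [('b', 150), ('c', 200)]
c.update_page("P1", [("a", 500)])
print(c.answer("hot"))                   # [('a', 500), ('c', 200)]
```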

Journal ArticleDOI
TL;DR: The issues encountered in the extended algebra and calculus languages for nested relations defined by Roth, Korth, and Silberschatz are discussed, including keying problems and the use of extended set operations.
Abstract: We discuss the issues encountered in the extended algebra and calculus languages for nested relations defined by Roth, Korth, and Silberschatz [4]. Their equivalence proof between the algebra and the calculus fails because of keying problems and the use of extended set operations. Extended set operations also have unintended side effects. Furthermore, their calculus seems to allow the generation of power sets, thus making it more powerful than their algebra.

Journal ArticleDOI
TL;DR: The problem of updating databases through interfaces based on the weak instance model is studied, thus extending previous proposals that considered them only from the query point of view.
Abstract: The problem of updating databases through interfaces based on the weak instance model is studied, thus extending previous proposals that considered them only from the query point of view. Insertions and deletions of tuples are considered. As a preliminary tool, a lattice on states is defined, based on the information content of the various states. Potential results of an insertion are states that contain at least the information in the original state and that in the new tuple. Sometimes there is no potential result, and in the other cases there may be many of them. We argue that the insertion is deterministic if the state that contains the information common to all the potential results (the greatest lower bound, in the lattice framework) is a potential result itself. Effective characterizations for the various cases exist. A symmetric approach is followed for deletions, with fewer cases, since there are always potential results; determinism is characterized as a consequence.

Journal ArticleDOI
TL;DR: A practically useful algorithm is presented that solves the maintenance problem of all ctm database schemes within a "not too large" bound, and it is shown that non-ctm database schemes are not maintainable in time less than linear in the state size.
Abstract: The maintenance problem of a database scheme is the following decision problem: Given a consistent database state ρ and a new tuple u over some relation scheme of ρ, is the modified state ρ ∪ {u} still consistent? A database scheme is said to be constant-time-maintainable (ctm) if there exists an algorithm that solves its maintenance problem by making a fixed number of tuple retrievals. We present a practically useful algorithm, called the canonical maintenance algorithm, that solves the maintenance problem of all ctm database schemes within a "not too large" bound. A number of interesting properties are shown for ctm database schemes, among them that non-ctm database schemes are not maintainable in time less than linear in the state size. A test method is given when only cover embedded functional dependencies (fds) appear. When the given dependencies consist of fds and the join dependency (jd) ⋈ R of the database scheme, testing whether a database scheme is ctm is reduced to the case of cover embedded fds. When dependency-preserving database schemes with only equality-generating dependencies (egds) are considered, it is shown that every ctm database scheme has a set of dependencies that is equivalent to a set of embedded fds, and thus, our test method for the case of embedded fds can be applied. In particular, this includes the important case of lossless database schemes with only egds.