
Showing papers on "Tuple" published in 1989


Journal ArticleDOI
01 Jun 1989
TL;DR: The Hybrid hash-join algorithm is found to be superior except when the join attribute values of the inner relation are non-uniformly distributed and memory is limited.
Abstract: In this paper we analyze and compare four parallel join algorithms. Grace and Hybrid hash represent the class of hash-based join methods, Simple hash represents a looping algorithm with hashing, and our last algorithm is the more traditional sort-merge. The performance of each of the algorithms with different tuple distribution policies, the addition of bit vector filters, varying amounts of main-memory for joining, and non-uniformly distributed join attribute values is studied. The Hybrid hash-join algorithm is found to be superior except when the join attribute values of the inner relation are non-uniformly distributed and memory is limited. In this case, a more conservative algorithm such as the sort-merge algorithm should be used. The Gamma database machine serves as the host for the performance comparison.

399 citations
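
The partition, build, and probe phases common to the hash-based joins compared in the entry above can be sketched in a few lines. This is a minimal single-process illustration in the style of a Grace join, not the parallel Gamma implementation the paper measures; the function names, the list-of-dicts relation format, and the `n_buckets` parameter are assumptions made for the example.

```python
from collections import defaultdict


def partition(rows, key, n_buckets):
    """Split rows into n_buckets lists by hashing the join attribute."""
    buckets = [[] for _ in range(n_buckets)]
    for row in rows:
        buckets[hash(row[key]) % n_buckets].append(row)
    return buckets


def grace_hash_join(inner, outer, key, n_buckets=4):
    """Join two lists of dicts on `key`, one bucket pair at a time."""
    result = []
    inner_parts = partition(inner, key, n_buckets)
    outer_parts = partition(outer, key, n_buckets)
    for inner_bucket, outer_bucket in zip(inner_parts, outer_parts):
        # Build phase: in-memory hash table on the inner bucket.
        table = defaultdict(list)
        for row in inner_bucket:
            table[row[key]].append(row)
        # Probe phase: look up each outer tuple in the table.
        for row in outer_bucket:
            for match in table.get(row[key], []):
                result.append({**match, **row})
    return result
```

Because every tuple with the same join value lands in the same bucket, a skewed inner relation produces one oversized bucket whose hash table may not fit in memory. That is the situation in which the abstract recommends a more conservative algorithm.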


Journal ArticleDOI
TL;DR: The class of derived relations considered in this paper is restricted to those defined by PSJ-expressions, that is, any relational algebra expressions constructed from an arbitrary number of project, select and join operations (but containing no self-joins).
Abstract: Consider a database containing not only base relations but also stored derived relations (also called materialized or concrete views). When a base relation is updated, it may also be necessary to update some of the derived relations. This paper gives sufficient and necessary conditions for detecting when an update of a base relation cannot affect a derived relation (an irrelevant update), and for detecting when a derived relation can be correctly updated using no data other than the derived relation itself and the given update operation (an autonomously computable update). The class of derived relations considered is restricted to those defined by PSJ-expressions, that is, any relational algebra expressions constructed from an arbitrary number of project, select and join operations (but containing no self-joins). The class of update operations consists of insertions, deletions, and modifications, where the set of tuples to be deleted or modified is specified by a selection condition on attributes of the relation being updated.

230 citations
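
The idea of an irrelevant update in the entry above can be illustrated with the simplest special case: a view defined by one selection and one projection over a single base relation, and an update that inserts a single tuple. The sketch below is only an illustration under those assumptions; the paper's conditions cover general PSJ-expressions and updates specified by selection conditions, and all names here are hypothetical.

```python
def materialize(rows, predicate, columns):
    """Materialize a select-project view as a set of tuples."""
    return {tuple(row[c] for c in columns) for row in rows if predicate(row)}


def is_irrelevant_insert(new_row, predicate):
    """An inserted tuple that fails the selection cannot affect the view."""
    return not predicate(new_row)


def apply_insert(view, new_row, predicate, columns):
    """Update the view using only the view itself and the inserted tuple."""
    if is_irrelevant_insert(new_row, predicate):
        return view
    return view | {tuple(new_row[c] for c in columns)}
```

In this restricted setting an insertion never requires reading the base relation again, so it is autonomously computable in the sense of the abstract. Deletions are harder, because a projection may discard the attributes needed to decide which view tuples should disappear.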


Book ChapterDOI
David Gelernter
12 Jun 1989
TL;DR: If the authors allow tuple spaces to be included among the fields of ordinary tuples, the Linda tuple-manipulation operators will allow us to operate not only on single data objects but on whole computations.
Abstract: Multiple tuple spaces have been envisioned for Linda since the system's first comprehensive description; they are intended for two purposes. First, by allowing tuples to be organized into a hierarchy of separate spaces, they should make it possible to construct large Linda programs out of modules, to realize Linda's long-standing potential to be a model for persistent storage, to enforce separation between the system and users in a Linda-based operating system, and to support abstraction. Second, if we allow tuple spaces to be included among the fields of ordinary tuples, the Linda tuple-manipulation operators will allow us to operate not only on single data objects but on whole computations.

208 citations
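
To make the second purpose in the entry above concrete, here is a toy tuple space in which a tuple field can itself be a tuple space. It is a sketch under simplifying assumptions: `None` acts as a wildcard in templates (Linda proper matches formals by type), only the non-blocking predicate operations `rdp` and `inp` are modelled alongside `out`, and the class is not thread-safe.

```python
class TupleSpace:
    """Toy tuple space; None in a template matches any field value."""

    def __init__(self):
        self._tuples = []

    def out(self, *fields):
        """Add a tuple to the space."""
        self._tuples.append(tuple(fields))

    @staticmethod
    def _matches(template, candidate):
        return len(template) == len(candidate) and all(
            t is None or t == c for t, c in zip(template, candidate)
        )

    def rdp(self, *template):
        """Return a matching tuple without removing it, or None."""
        for candidate in self._tuples:
            if self._matches(template, candidate):
                return candidate
        return None

    def inp(self, *template):
        """Remove and return a matching tuple, or None."""
        for i, candidate in enumerate(self._tuples):
            if self._matches(template, candidate):
                return self._tuples.pop(i)
        return None
```

A call such as `outer.out("module", inner)`, where `inner` is another `TupleSpace`, stores a whole space as a field. Withdrawing that tuple with `outer.inp("module", None)` then hands the caller the entire nested space in one operation.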


Journal ArticleDOI
TL;DR: It is shown that a computational model developed in the framework of resolution provides a very adequate tool to study and develop query answering procedures for deductive databases, as well as for logic programs, and that the framework provided by QoSaQ is powerful enough to account for the best-known recursive query evaluation methods.

156 citations


Book
12 Jun 1989
TL;DR: The advanced information management prototype Verso, a database machine based on nested relations, and an approach to manage large inheritance networks within a DBS supporting nested relations are presented.
Abstract: The advanced information management prototype.- Verso: A database machine based on nested relations.- The two roles of nested relations in the DASDBS project.- A storage structure for Nested Relational Databases.- Four views of complex objects: A sophisticate's introduction.- An introduction to the completeness of languages for complex objects and nested relations.- On the uniqueness of nested relations.- An introduction to the Nested Sequences of Tuples data model and algebra.- Recursively defined complex objects.- Query languages for Nested Relational Databases.- Nested relations and recursive queries.- Realization of nested relation interfaces for relational and network databases.- An approach to manage large inheritance networks within a DBS supporting nested relations.- On the normalization in Nested Relational Databases.- Complex objects modeling: An entity-relationship approach.- A data model for complex objects based on a semantic database model and nested relations.- ?-Acyclic database schemes and nested relations.

128 citations


Journal ArticleDOI
01 Jun 1989
TL;DR: This paper shows that queries are more efficient and succinct when expressed in the recursive algebra than in languages that require restructuring in order to access subrelations of relations and that most of the query optimization techniques that have been developed for the relational algebra can be easily extended for the recursive algebra.
Abstract: The nested relational model provides a better way to represent complex objects than the (flat) relational model, by allowing relations to have relation-valued attributes. A recursive algebra for nested relations that allows tuples at all levels of nesting in a nested relation to be accessed and modified without any special navigational operators and without having to flatten the nested relation has been developed. In this algebra, the operators of the nested relational algebra are extended with recursive definitions so that they can be applied not only to relations but also to subrelations of a relation. In this paper, we show that queries are more efficient and succinct when expressed in the recursive algebra than in languages that require restructuring in order to access subrelations of relations. We also show that most of the query optimization techniques that have been developed for the relational algebra can be easily extended for the recursive algebra and that queries are more easily optimizable when expressed in the recursive algebra than when they are expressed in languages like the non-recursive algebra.

119 citations


Journal ArticleDOI
TL;DR: It is shown that even for a single remote view, there are many instances where the update procedure performs better (with respect to total I/O and communication costs) than a base table approach.
Abstract: The problem of updating materialized views in distributed database systems is discussed. An architecture and detailed procedures for updating a collection of remote views with arbitrary refresh times by using a single differential file are described. The efficiency of the update procedure is enhanced by adopting a multiquery optimization approach and by introducing a powerful prescreening procedure to eliminate differential tuples. It is shown that even for a single remote view, there are many instances where the update procedure performs better (with respect to total I/O and communication costs) than a base table approach.

109 citations


Journal ArticleDOI
TL;DR: This is the first model based on a many-sorted instead of a one-sorted algebra, which means that atomic data values as well as nested structures are objects of the algebra, and can be used directly as a rich query language for office documents with precisely defined semantics.
Abstract: We describe a data model for structured office information objects, which we generically call “documents,” and a practically useful algebraic language for the retrieval and manipulation of such objects. Documents are viewed as hierarchical structures; their layout (presentation) aspect is to be treated separately. The syntax and semantics of the language are defined precisely in terms of the formal model, an extended relational algebra. The proposed approach has several new features, some of which are particularly useful for the management of office information. The data model is based on nested sequences of tuples rather than nested relations. Therefore, sorting and sequence operations and the explicit handling of duplicates can be described by the model. Furthermore, this is the first model based on a many-sorted instead of a one-sorted algebra, which means that atomic data values as well as nested structures are objects of the algebra. As a consequence, arithmetic operations, aggregate functions, and so forth can be treated inside the model and need not be introduced as query language extensions to the model. Many-sorted algebra also allows arbitrary algebra expressions (with Boolean result) to be admitted as selection or join conditions and the results of arbitrary expressions to be embedded into tuples. In contrast to other formal models, this algebra can be used directly as a rich query language for office documents with precisely defined semantics.

105 citations


01 Jan 1989
TL;DR: A language for databases with sets, tuples, lists, object identity and structural inheritance, which is logic-based with a fixpoint semantics and the introduction of explicit control is proposed.
Abstract: A language for databases with sets, tuples, lists, object identity and structural inheritance is proposed. The core language is logic-based with a fixpoint semantics. Methods with overloading and methods evaluated externally providing extensibility of the language are considered. Other important issues such as updates and the introduction of explicit control are discussed.

98 citations


Proceedings ArticleDOI
29 Mar 1989
TL;DR: The type model presents two notions: that of classes whose instances are objects with identity and that of types whose instances are complex values, which are mixed in that an object is modeled as a pair containing an identifier and a value, and a value is a complex structure which contains objects and values.
Abstract: In this paper, we present a type model for object-oriented databases. Most object-oriented databases only provide users with flat objects whose structure is a record of other objects. In order to have a powerful expression power, an object-oriented database should not only provide objects but also complex values recursively built using the set, tuple and disjunctive constructors. Our type model presents two notions: that of classes whose instances are objects with identity and that of types whose instances are complex values. The two notions are mixed in that an object is modeled as a pair containing an identifier and a value, and a value is a complex structure which contains objects and values. We define in this context the notions of subtyping and provide a set inclusion semantics for subtyping.

62 citations


Proceedings ArticleDOI
06 Feb 1989
TL;DR: It is shown that even for a single remote view there are many instances where the presented update procedure performs better (with respect to total I/O and communication costs) than existing methods.
Abstract: The problem of updating materialized views in distributed database systems is discussed. An architecture and detailed procedures to update a collection of remote views with arbitrary refresh times by using a single differential file are prescribed. The efficiency of the update procedure is enhanced by adopting a multiple-query optimization approach and by introducing a powerful prescreening procedure to eliminate differential tuples. It is shown that even for a single remote view there are many instances where the presented update procedure performs better (with respect to total I/O and communication costs) than existing methods.

Journal ArticleDOI
01 Jun 1989
TL;DR: A method to finitely represent infinite least fixpoints and infinite query answers as relational specifications is presented, applicable to every domain-independent set of functional rules.
Abstract: We investigate here functional deductive databases: an extension of DATALOG capable of representing infinite phenomena. Rules in functional deductive databases are Horn and predicates can have arbitrary unary and limited k-ary function symbols in one fixed position. This class is known to be decidable. However, least fixpoints of functional rules may be infinite. We present here a method to finitely represent infinite least fixpoints and infinite query answers as relational specifications. Relational specifications consist of a finite set of tuples and of a finitely specified congruence relation. Our method is applicable to every domain-independent set of functional rules.

Journal ArticleDOI
TL;DR: The proposed summary data model, enforcing the disjointness constraint, alleviates the intractable problem without loss of information and provides for efficient operations, including summary data search, derivation, insertion, and deletion.
Abstract: A data model and an access method for summary data management are presented. Summary data, represented as a trinary tuple (statistical function, category, summary), are metaknowledge summarized by a statistical function of a category of individual information typically stored in a conventional database. For instance, (average-income, female engineer with 10 years' experience and master's degree, $45000) is a summary datum. The computational complexity of the derivability problem has been found intractable in general, and the proposed summary data model, enforcing the disjointness constraint, alleviates the intractable problem without loss of information. In order to store, manage, and access summary data, a multidimensional access method called summary data (SD) tree is proposed. By preserving the category hierarchy, the SD tree provides for efficient operations, including summary data search, derivation, insertion, and deletion.

01 Jan 1989
TL;DR: A new implementation of Linda based on a two-layer model is described, which reports on the results of some experiments in which a group of VAXes operating in parallel were able to attain performance competitive with super-computers on interesting problems.
Abstract: Linda is a programming language intended for explicitly parallel programming. It is based on a unique communication mechanism, a high-level shared memory known as tuple space. In this dissertation, we describe a new implementation of Linda based on a two-layer model. One layer is centered on multiple processes, possibly running on multiple processors, sharing a common memory. The second layer coordinates a number of these multiprocessors connected by a relatively slow, somewhat unreliable link. This two-layer model corresponds to an increasingly common computing environment: a local area network with workstations and "compute servers." Our implementation is designed to allow sharing of both the computers and the LAN with other user processes, running Linda or anything else. It runs on VAXes connected by an Ethernet and running the VMS operating system. The system we developed supports a new dialect of Linda, Linda-C, which extends the C language's type system to allow the programmer to make effective use of tuple space's dependency on types. In this way, the programmer can provide in a natural way information that allows the compiler to generate efficient code. The VAX Linda-C compiler also supports separate compilation, maintaining a cross-module set of type definitions so that separately compiled modules can communicate. We report on the results of some experiments in which a group of VAXes operating in parallel were able to attain performance competitive with super-computers on interesting problems. The process of designing and implementing VAX Linda-C required us to examine a variety of problems in such areas as optimization, associative data structures, and network algorithms. We present novel approaches to some of these problems.

Proceedings Article
20 Aug 1989
TL;DR: This paper considers the composition of tuples from two relations in order to derive additional tuples of one of these relations and shows how a set of underlying attributes, independently specified for each relation, is sufficient for determining plausible composition.
Abstract: This paper considers the composition of tuples from two relations in order to derive additional tuples of one of these relations. Our purpose is to determine when the composition is plausible and for which relation the new tuples are derived. We first present a formal definition of composition and our extension to it. We next define conditions on the domains and ranges of the relations that are necessary for extended composition to occur. We then show how a set of underlying attributes, independently specified for each relation, is sufficient for determining plausible composition, when the primitives are combined according to an algebra. Finally, we apply our method for extended composition to a representative group of semantic relations and evaluate the results.

Journal ArticleDOI
01 Jun 1989
TL;DR: Two capabilities to aid an interactive database user who is neither an application specialist nor a DBMS expert are described which extend the operations of a relational DBMS.
Abstract: Interactive use of relational database management systems (DBMS) requires a user to be knowledgeable about the semantics of the application represented in the database. In many cases, however, users are not trained in the application field and are not DBMS experts. Two categories of functionality are problematic for such users: (1) updating a database without violating integrity constraints imposed by the domain and (2) using join operations to retrieve data from more than one relation. We have been conducting research to help an uninformed or casual user interact with a relational DBMS. This paper describes two capabilities to aid an interactive database user who is neither an application specialist nor a DBMS expert. We have developed deferred Referential Integrity Checking (RIC) and Intelligent Join (IJ) which extend the operations of a relational DBMS. These facilities are made possible by explicit representation of database semantics combined with a relational schema. Deferred RIC is a static validation procedure that checks uniqueness of tuples, non-null keys, uniqueness of keys, and inclusion dependencies. IJ allows a user to identify only the “target” data which is to be retrieved without the need to additionally specify “join clauses”. In this paper we present the motivation for these facilities, describe the features of each, and present examples of their use.

Journal ArticleDOI
TL;DR: The author presents the results of simulations that compare the performance of this approach with the simple join technique and the proposed approach is seen to provide better performance for an average domain value size of greater than between 2 and 4 bytes.
Abstract: The use of a composite index known as the B_c-tree is presented; it is based on the concept of the B+-tree and serves the dual purpose of an attribute and join index and indirectly implements the link sets. Each leaf node of the B_c-tree incorporates a reference to all tuples in the database that share common data values of a shared domain. In addition to improving the performance of the join and selection operations, the composite index facilitates the enforcement of structural integrity constraints. The author also presents the results of simulations that compare the performance of this approach with the simple join technique. The proposed approach, in the case of the simulated database, is seen to provide better performance for an average domain value size of greater than between 2 and 4 bytes.

Journal ArticleDOI
TL;DR: The author examines join processing when the access paths available are nonclustered indexes on the joining attribute(s) for both relations involved in the join and gives performance comparisons of these heuristics and another method that recently appeared in the literature.
Abstract: The author examines join processing when the access paths available are nonclustered indexes on the joining attribute(s) for both relations involved in the join. He uses a bipartite graph model to represent the pages from the two relations that contain tuples to be joined. The minimization of the number of page accesses needed to compute a join in the author's database environment is explored from two perspectives. The first is to reduce the maximum buffer size so that no page is accessed more than once, and the second is to reduce the number of page accesses for a fixed buffer size. The author has developed heuristics for these problems. He gives performance comparisons of these heuristics and another method that recently appeared in the literature. Results show that one particular heuristic performs very well for addressing the problem from either perspective.

Proceedings ArticleDOI
06 Feb 1989
TL;DR: A general report is presented on an approach problem of privacy-oriented information systems, based on extensive research experiences in specifying the structure of such a system, including the underlying data model and the privacy policy, as well as on the insight gained from a prototype implementation of selected parts of the specification.
Abstract: A general report is presented on an approach problem of privacy-oriented information systems. The report is based on extensive research experiences in specifying the structure of such a system, including the underlying data model and the privacy policy, as well as on the insight gained from a prototype implementation of selected parts of the specification. The system is called DORIS (datenschutz-orientiertes informations system). While the model is basically object-oriented, it is possible conveniently to describe an application by non-first-normal-form tuples and relations, and the data-manipulation language is high-level and relational. An expression is evaluated in three stages: navigation in the set of surrogates of persons, asking for knowledge, and finally normalization, prime value processing and output preparation. A prototype implementation of selected parts of the model is based on a kernel concept.

Proceedings ArticleDOI
23 Oct 1989
TL;DR: Introduces Match Box, an incremental matching algorithm for determining the tuple instantiations of forward-chaining production rules that can perform a rule's computationally intensive incremental join testing in constant time.
Abstract: Introduces Match Box, an incremental matching algorithm for determining the tuple instantiations of forward-chaining production rules. Match Box is rooted in the mathematical interconnections between tuple and binding spaces, a framework also applicable to other pattern matching algorithms. The idea is to precompute a rule's binding space and then have each binding independently monitor working memory for the incremental formation of tuple instantiations. A key feature of Match Box is that, on a massively parallel architecture, it can perform a rule's computationally intensive incremental join testing in constant time. It also finds application on conventional serial processors.

Proceedings ArticleDOI
Charles Elkan
29 Mar 1989
TL;DR: An algorithm is presented that decides whether two conjunctive query expressions always describe disjoint sets of tuples, and it uses tableaux that are capable of representing all six comparison operators.
Abstract: This paper presents an algorithm that decides whether two conjunctive query expressions always describe disjoint sets of tuples. The decision procedure solves an open problem identified by Blakeley, Coburn, and Larson: how to check whether an explicitly stored view relation must be recomputed after an update, taking into account functional dependencies. For nonconjunctive queries, the disjointness problem is NP-hard. For conjunctive queries, the time complexity of the algorithm given cannot be improved unless the reachability problem for directed graphs can be solved in sublinear time. The algorithm is novel in that it combines separate decision procedures for the theory of functional dependencies and for the theory of dense orders. Also, it uses tableaux that are capable of representing all six comparison operators: <, ≤, =, ≥, >, and ≠.

Journal ArticleDOI
Y.-E.C. Cheng, S.-Y. Lu
TL;DR: The application of this scheme is illustrated by two problems in seismic horizon detection: seismic skeletonization and loop tying, which demonstrate how to cast an application problem into the formalism of the scheme.
Abstract: For a given set of n tuples, the binary consistency checking scheme generates a subset wherein no two elements intersect. The application of this scheme is illustrated by two problems in seismic horizon detection: seismic skeletonization and loop tying. After a brief introduction to seismic interpretation, these two examples are used to demonstrate how to cast an application problem into the formalism of the scheme. A comparison of this scheme to the dynamic programming approach to string matching due to S.Y. Lu (1982) is included.

Book ChapterDOI
21 Jul 1989
TL;DR: This paper examines the problem of detecting and deleting duplicates within the extended NF2 data model and a new method, based on sorting complex objects, is proposed, which is both time- and space-efficient.
Abstract: A current research topic in the area of relational databases is the design of systems based on the Non First Normal Form (NF2) data model. One particular development, the so-called extended NF2 data model, even permits structured values like lists and tuples to be included as attributes in relations. It is thus well suited to represent complex objects for non-standard database applications. A DBMS which uses this model, called the Advanced Information Management Prototype, is currently being implemented at the IBM Heidelberg Scientific Center. In this paper we examine the problem of detecting and deleting duplicates within this data model. Several alternative approaches are evaluated and a new method, based on sorting complex objects, is proposed, which is both time- and space-efficient.
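
Sort-based duplicate elimination over nested values, the general idea named in the entry above, can be sketched as follows. This is an in-memory illustration, not the method the paper proposes for the Advanced Information Management Prototype; the `canonical` encoding and both function names are assumptions. Sets are order-normalized before comparison, while lists and tuples keep their element order.

```python
def canonical(value):
    """Map a nested value to a sortable, order-normalized form."""
    if isinstance(value, (set, frozenset)):
        return ("set", tuple(sorted(canonical(v) for v in value)))
    if isinstance(value, list):
        return ("list", tuple(canonical(v) for v in value))
    if isinstance(value, tuple):
        return ("tuple", tuple(canonical(v) for v in value))
    # Atoms carry their type name so that mixed types stay comparable.
    return ("atom", type(value).__name__, value)


def drop_duplicates(rows):
    """Sort rows by canonical form, then keep the first of each equal run."""
    keyed = sorted(((canonical(r), r) for r in rows), key=lambda pair: pair[0])
    unique = []
    previous = None
    for key, row in keyed:
        if key != previous:
            unique.append(row)
            previous = key
    return unique
```

After sorting, equal objects are adjacent, so a single pass that compares each key with its predecessor is enough to remove every duplicate.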

Proceedings ArticleDOI
29 Mar 1989
TL;DR: It is argued that the insertion is deterministic if the state that contains the information common to all the potential results is itself a potential result, the greatest lower bound, in the lattice framework.
Abstract: Database updates have recently received much more attention than in the past. In this trend, a solid foundation is provided to the problem of updating databases through interfaces based on the weak instance model. Insertions and deletions of tuples are considered. As a preliminary tool, a lattice on states is defined, based on the information content of the various states. Potential results of an insertion are states that contain at least the information in the original state and that in the new tuple. Sometimes there is no potential result, and in the other cases there may be many of them. We argue that the insertion is deterministic if the state that contains the information common to all the potential results (the greatest lower bound, in the lattice framework) is itself a potential result. Effective characterizations for the various cases exist. A symmetric approach is followed for deletions, with fewer cases, since there are always potential results; determinism is characterized consequently.

Journal ArticleDOI
TL;DR: An encoding method that copes with one-to-many hierarchical relationships in a very different way: rather than storing the key of the parent tuple, it creates a code for the path of all the ancestor tuples.

Journal ArticleDOI
TL;DR: This paper introduces a new algebraic operation, called grouped generalized division (GGD), which overcomes shortcomings of relational algebra and shows how the GGD operation can be expressed in terms of the other more “primitive” algebraic operations.

DOI
01 Jan 1989
TL;DR: Match Box is a new incremental matching algorithm for determining the tuple instantiations of forward-chaining production rules that can perform a rule's computationally intensive incremental join testing in constant time and finds application on conventional serial processors.
Abstract: We introduce Match Box, a new incremental matching algorithm for determining the tuple instantiations of forward-chaining production rules. Match Box is rooted in the mathematical interconnections between tuple and binding spaces, a framework also applicable to other matching algorithms. The idea is to precompute a rule's binding space, and then have each binding independently monitor working memory for the incremental formation of tuple instantiations. A key feature of Match Box is that on a massively parallel architecture, it can perform a rule's computationally intensive incremental join testing in constant time. It also finds application on conventional serial processors.

Proceedings ArticleDOI
20 Sep 1989
TL;DR: An adaptive approach which utilizes the information embedded in indexes to identify the tuples satisfying a given predicate or having a match in a join operation is proposed, so that the overhead of the approach is minimized.
Abstract: An adaptive approach which utilizes the information embedded in indexes to identify the tuples satisfying a given predicate or having a match in a join operation is proposed. An access path (index or table scan) and a join method (index join, nested loop, sort-merge) are chosen to construct the results adaptively. This leads to the optimal evaluation of queries. With an efficient implementation, the adaptive decision process becomes a part of a query evaluation procedure, so that the overhead of the approach is minimized.

Proceedings ArticleDOI
06 Feb 1989
TL;DR: The proposed access method, called the summary data tree, or SD-tree, which handles an orthogonal category as a hyperrectangle, realizes the proposed summary data model and provides for efficient operations including summary data search, derivation, and insertion on the stored summary data.
Abstract: A data model and an access method for summary data management are proposed. Summary data, represented as a trinary tuple (statistical function, category, summary), consist of metaknowledge summarized by a statistical function of a category of individual information typically stored in a conventional database. The concept of category (type or class) and the additivity property of statistical functions form a basis for the model that allows for the derivation of summary data. The complexity of deriving summary data has been found computationally intractable in general, and the proposed summary data model, with disjointness constraint, solves the problem without the loss of information. The proposed access method, called the summary data tree, or SD-tree, which handles an orthogonal category as a hyperrectangle, realizes the proposed summary data model. The structure of the SD-tree provides for efficient operations including summary data search, derivation, and insertion on the stored summary data.

01 Jan 1989
TL;DR: This paper presents a rule-based query language for a structurally object-oriented database model that is based on the new notion of “access-predicates”, a construct for homogeneous access to complex object types and is-a-relationships.
Abstract: This paper presents a rule-based query language for a structurally object-oriented database model. The database model includes concepts such as object-types, set and tuple constructors, and is-a-relationships. These concepts are equivalently represented by some kind of nested relations and additional surrogate attributes. The query language is based on the new notion of “access-predicates”, a construct for homogeneous access to complex object types and is-a-relationships. We propose three different approaches to the evaluation of these complex rules: the transformation into flat rules (for classical relations) and further to relational algebra, the evaluation by a nested relational algebra and an object algebra.