Journal ArticleDOI

Containment of conjunctive queries: beyond relations as sets

TLDR
This paper studies conjunctive queries over databases in which each tuple has an associated label, demonstrates the fundamental difference between viewing relations as sets and as multisets, and motivates a closer examination of relations as multisets, given their importance in SQL.
Abstract
Conjunctive queries are queries over a relational database and are at the core of relational query languages such as SQL. Testing for containment (and equivalence) of such queries arises as part of many advanced features of query optimization, for example, using materialized views, processing correlated nested queries, semantic query optimization, and global query optimization. Earlier formal work on the topic has examined conjunctive queries over sets of tuples, where each query can be viewed as a function from sets to sets. Containment (and equivalence) of conjunctive queries has been naturally defined based on set inclusion and has been shown to be an NP-complete problem. Even in SQL, however, queries over multisets of tuples may be posed. In fact, relations are treated as multisets by default, with duplicates being eliminated only after explicit requests. Thus, in order to reason about containment/equivalence of a large class of SQL queries, it is necessary to consider a generalization of conjunctive queries, in which relations are interpreted as multisets of tuples: the view of a relation as a set of tuples must be generalized. In this paper we study conjunctive queries over databases in which each tuple has an associated label. This generalized notion of a database allows us to consider relations that are multisets and relations that are fuzzy sets. As a special case, we can also model traditional set-relations by making the label associated with a tuple be either “true” (meaning that the tuple is in the relation) or “false” (meaning that the tuple is not in the relation). In order to keep our results general, we consider a variety of label systems, where each label system is essentially a set of conditions on the labels that can be associated with tuples. Once a result is established for a label system, it holds for all interpretations of relations that satisfy these conditions.
For example, we present a necessary and sufficient condition for containment of conjunctive queries for label systems of a type that abstracts both the traditional set-relations and fuzzy sets. We also present a different necessary and sufficient condition for containment of a restricted class of conjunctive queries for a label system that abstracts relations as multisets. Finally, we show that containment of unions of conjunctive queries is decidable for label systems of the first type and undecidable for label systems of the second type. This result underscores the fundamental difference between viewing relations as sets and as multisets, and motivates a closer examination of relations as multisets, given their importance in SQL.
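
The gap between the set and multiset views can be seen on a very small example. The sketch below is not taken from the paper: it is a minimal illustration, assuming a relation is stored as a Python `Counter` mapping each tuple to its multiplicity, and a conjunctive query is a head plus a list of atoms whose arguments are all variables. The names `eval_bag`, `bag_contained`, `q1`, and `q2` are hypothetical. It evaluates Q1(x) :- R(x) and Q2(x) :- R(x), R(x), which return the same set of answers but different multiplicities.

```python
from collections import Counter
from itertools import product


def eval_bag(head, body, db):
    """Evaluate a conjunctive query under multiset (bag) semantics.

    head: tuple of variable names returned by the query
    body: list of (relation_name, tuple_of_variable_names) atoms
    db:   dict mapping relation_name -> Counter {tuple: multiplicity}
    Assumption: atoms contain only variables, no constants.
    """
    result = Counter()
    # Pick one stored tuple (with its multiplicity) for every atom in the body.
    for choice in product(*(db[rel].items() for rel, _ in body)):
        binding, mult, consistent = {}, 1, True
        for (_, variables), (tup, m) in zip(body, choice):
            for var, val in zip(variables, tup):
                # A variable must take the same value everywhere it occurs.
                if binding.setdefault(var, val) != val:
                    consistent = False
                    break
            if not consistent:
                break
            mult *= m  # multiplicities of the joined tuples multiply
        if consistent:
            result[tuple(binding[v] for v in head)] += mult
    return result


def bag_contained(r1, r2):
    """True if every tuple occurs in r2 at least as often as in r1."""
    return all(r1[t] <= r2[t] for t in r1)


# R holds ("a",) twice and ("b",) once.
db = {"R": Counter({("a",): 2, ("b",): 1})}
q1 = (("x",), [("R", ("x",))])
q2 = (("x",), [("R", ("x",)), ("R", ("x",))])

r1 = eval_bag(*q1, db)
r2 = eval_bag(*q2, db)

same_as_sets = set(r1) == set(r2)       # duplicates ignored
q1_in_q2 = bag_contained(r1, r2)        # multiplicities compared
q2_in_q1 = bag_contained(r2, r1)
```

On this database the two answers coincide as sets, yet the tuple `("a",)` has multiplicity 2 in the answer to Q1 and 4 in the answer to Q2, so Q2 is not contained in Q1 once duplicates are counted. A single database can only refute containment; establishing it requires a condition that holds over all databases, which is the subject of the paper.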


Citations
Book

Large Networks and Graph Limits

TL;DR: Laszlo Lovasz has written an admirable treatise on the exciting new theory of graph limits and graph homomorphisms, an area of great importance in the study of large networks.
Proceedings ArticleDOI

Provenance semirings

TL;DR: It is shown that relational algebra calculations for incomplete databases, probabilistic databases, bag semantics and why-provenance are particular cases of the same general algorithms involving semirings, and suggests a comprehensive provenance representation that uses semirings of polynomials.
Proceedings ArticleDOI

On the decidability of query containment under constraints

TL;DR: This work addresses query containment under constraints within a setting where constraints are specified in the form of special inclusion dependencies over complex expressions, built by using intersection and difference of relations, special forms of quantification, regular expressions over binary relations, and cardinality constraints.
BookDOI

Managing and Mining Uncertain Data

TL;DR: Managing and Mining Uncertain Data, a survey with chapters by a variety of well known researchers in the datamining field, presents the most recent models, algorithms, and applications in the uncertain data mining field in a structured and concise way.
Journal Article

Models for Incomplete and Probabilistic Information.

TL;DR: In this article, the expressive power of probabilistic c-tables over infinite domains and algebraic completeness and closure under query languages of general probabilistic database models are discussed.
References
Proceedings ArticleDOI

Optimal implementation of conjunctive queries in relational data bases

TL;DR: It is shown that while answering conjunctive queries is NP-complete (general queries are PSPACE-complete), one can find an implementation that is within a constant of optimal.
Journal ArticleDOI

Efficiently updating materialized views

TL;DR: This work proposes a method in which all database updates to base relations are first filtered to remove from consideration those that cannot possibly affect the view.
Book

Optimizing queries with materialized views

TL;DR: In this article, a simple generalization of the traditional query optimization algorithm is proposed to optimize queries in the presence of materialized views. But, the optimization problem is not addressed in this paper.
Journal ArticleDOI

Equivalences Among Relational Expressions with the Union and Difference Operators

TL;DR: It is shown that containment of tableaux is a necessary step in testing equivalence of queries with union and difference, and the containment problem is shown to be NP-complete even for tableaux that correspond to expressions with only one project and several join operators.
Proceedings ArticleDOI

Optimizing queries with materialized views

TL;DR: This paper analyzes the optimization question and provides a comprehensive and efficient solution and has the desirable property that it is a simple generalization of the traditional query optimization algorithm.