Posted Content

The View Update Problem Revisited

TL;DR: This paper characterises when a view mapping is invertible, showing that this is the case precisely when each database symbol has an exact rewriting in terms of the view symbols under the given constraints, and provides a general effective criterion for deciding whether the changes introduced by a view update can be propagated to the underlying database relations in a unique and unambiguous way.
Abstract: In this paper, we revisit the view update problem in a relational setting and propose a framework based on the notion of determinacy under constraints. Within such a framework, we characterise when a view mapping is invertible, establishing that this is the case precisely when each database symbol has an exact rewriting in terms of the view symbols under the given constraints, and we provide a general effective criterion for deciding whether the changes introduced by a view update can be propagated to the underlying database relations in a unique and unambiguous way. Afterwards, we show how determinacy under constraints can be checked, and rewritings effectively found, in three different relevant scenarios in the absence of view constraints. First, we settle the long-standing open issue of how to solve the view update problem in a multi-relational database with views that are projections of joins of relations, and we do so in a more general setting where views are defined by arbitrary conjunctive queries and database constraints are stratified embedded dependencies. Next, we study a setting based on horizontal decompositions of a single database relation, where views are defined by selections on possibly interpreted attributes (e.g., arithmetic comparisons) in the presence of domain constraints over the database schema. Lastly, we look into another multi-relational database setting, where views are defined in an expressive "Type" Relational Algebra based on the n-ary Description Logic DLR and database constraints are inclusions of expressions in that algebra.
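
As a hedged illustration of determinacy and exact rewriting (the relation and view names here are ours, not the paper's): take a database relation R(x, y, z) and two projection views defined by the conjunctive queries

    V_1(x, y) \leftarrow R(x, y, z) \qquad\qquad V_2(y, z) \leftarrow R(x, y, z)

The candidate rewriting R(x, y, z) \leftarrow V_1(x, y) \wedge V_2(y, z) is exact precisely when the database constraints force R to satisfy the join dependency R = \pi_{xy}(R) \bowtie \pi_{yz}(R), i.e., the multivalued dependency y \twoheadrightarrow x. Under that constraint the views determine R, the view mapping is invertible, and a view update can be propagated by re-joining the updated views.
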
Citations
Proceedings Article
01 Jan 2015
TL;DR: A novel benchmark for OBDA systems based on real data from the oil industry, the Norwegian Petroleum Directorate (NPD) FactPages, is proposed, together with novel techniques to generate, from the NPD data, datasets of increasing size that take into account the requirements dictated by the OBDA setting.
Abstract: In the last decades we moved from a world in which an enterprise had one central database (rather small by today's standards) to a world in which many different, and big, databases must interact and operate, providing the user with an integrated and understandable view of the data. Ontology-Based Data Access (OBDA) is becoming a popular approach to cope with this new scenario. OBDA separates the user from the data sources by means of a conceptual view of the data (ontology) that provides clients with a convenient query vocabulary. The ontology is connected to the data sources through a declarative specification given in terms of mappings. Although prototype OBDA systems providing the ability to answer SPARQL queries over the ontology are available, a significant challenge remains when it comes to using these systems in industrial environments: performance. To properly evaluate OBDA systems, benchmarks tailored towards the requirements in this setting are needed. In this work, we propose a novel benchmark for OBDA systems based on real data coming from the oil industry: the Norwegian Petroleum Directorate (NPD) FactPages. Our benchmark comes with novel techniques to generate, from the NPD data, datasets of increasing size, taking into account the requirements dictated by the OBDA setting. We validate our benchmark on significant OBDA systems, showing that it is more adequate than previous benchmarks not tailored for OBDA.

62 citations

Proceedings ArticleDOI
06 Jul 2015
TL;DR: This work solves a well known, long-standing open problem in relational databases theory, showing that the conjunctive query determinacy problem (in its "unrestricted" version) is undecidable.
Abstract: We solve a well known, long-standing open problem in relational databases theory, showing that the conjunctive query determinacy problem (in its "unrestricted" version) is undecidable.

24 citations

Proceedings ArticleDOI
15 Jun 2016
TL;DR: This paper shows that the Conjunctive Query Finite Determinacy Problem is undecidable, and exhibits a set Q of CQs that does not determine a CQ Q0 but finitely determines it.
Abstract: We solve a well known and long-standing open problem in database theory, proving that the Conjunctive Query Finite Determinacy Problem is undecidable. The technique we use builds on top of the Red Spider method invented in our paper [GM15] to show undecidability of the same problem in the "unrestricted case" -- when database instances are allowed to be infinite. We also show a specific instance Q0, Q = {Q1, Q2, ..., Qk} such that the set Q of CQs does not determine the CQ Q0 but finitely determines it. Finally, we claim that while Q0 is finitely determined by Q, there is no FO-rewriting of Q0 with respect to Q.
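
For reference, the standard notion at play here, stated in our own notation: a set of CQs Q determines a CQ Q_0 when the images of the views fix the answer to Q_0,

    Q \twoheadrightarrow Q_0 \quad\text{iff}\quad \forall D_1, D_2:\ \big(\forall Q_i \in Q:\ Q_i(D_1) = Q_i(D_2)\big) \Rightarrow Q_0(D_1) = Q_0(D_2).

Finite determinacy is the same condition with D_1 and D_2 ranging over finite instances only; the instance exhibited in the abstract is precisely one where the two notions come apart.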

22 citations

Proceedings Article
01 Jan 2014
TL;DR: A novel benchmark for OBDA systems based on data from the Norwegian Petroleum Directorate (NPD) is proposed, together with novel techniques to generate, from the available data, datasets of increasing size that take into account the requirements dictated by the OBDA setting.
Abstract: In Ontology-Based Data Access (OBDA), queries are posed over a high-level conceptual view, and then translated into queries over a potentially very large (usually relational) data source. The ontology is connected to the data sources through a declarative specification given in terms of mappings. Although prototype OBDA systems providing the ability to answer SPARQL queries over the ontology are available, a significant challenge remains: performance. To properly evaluate OBDA systems, benchmarks tailored towards the requirements in this setting are needed. OWL benchmarks, which have been developed to test the performance of generic SPARQL query engines, however, fail to evaluate OBDA specific features. In this work, we propose a novel benchmark for OBDA systems based on the Norwegian Petroleum Directorate (NPD). Our benchmark comes with novel techniques to generate, from available data, datasets of increasing size, taking into account the requirements dictated by the OBDA setting. We validate our benchmark on significant OBDA systems, showing that it is more adequate than previous benchmarks not tailored for OBDA.

15 citations

Journal ArticleDOI
TL;DR: This contribution systematically exploits logical reduction techniques for big distributed data handling, extending and generalizing the known propagation techniques and introducing the notion of strongly distributed databases.
Abstract: This contribution deals with the systematic exploitation of logical reduction techniques for big distributed data handling. The particular applications are views and parallel updates over large-scale distributed databases, as well as handling of queries over different generations of databases. Logical reduction techniques come in two flavors. The first is syntactically defined translation schemes, which describe transformations of database schemes. They give rise to two induced maps, translations and transductions: transductions describe the induced transformation of database instances, and translations describe the induced transformations of queries. The second is Feferman-Vaught reductions, which apply in situations of distributed databases. The reduction describes how queries over a distributed database can be computed from queries over the components and queries over the index set. Combining and developing these techniques allows us to introduce the notion of strongly distributed databases. For such databases, we extend and generalize the known propagation techniques. The method allows unification of distributed and parallel computation and communication, as well as a significant reduction of the communication load. The proposed general approach may be easily adapted to other distributed objects and their integration into large-scale systems.

5 citations

References
Book ChapterDOI
08 Jan 2003
TL;DR: The notion of "certain answers" from indefinite databases is adopted as the semantics for query answering in data exchange, and the computational complexity of computing the certain answers in this context is investigated.
Abstract: Data exchange is the problem of taking data structured under a source schema and creating an instance of a target schema that reflects the source data as accurately as possible. In this paper, we address foundational and algorithmic issues related to the semantics of data exchange and to query answering in the context of data exchange. These issues arise because, given a source instance, there may be many target instances that satisfy the constraints of the data exchange problem. We give an algebraic specification that selects, among all solutions to the data exchange problem, a special class of solutions that we call universal. A universal solution has no more and no less data than required for data exchange and it represents the entire space of possible solutions. We then identify fairly general, and practical, conditions that guarantee the existence of a universal solution and yield algorithms to compute a canonical universal solution efficiently. We adopt the notion of "certain answers" in indefinite databases for the semantics for query answering in data exchange. We investigate the computational complexity of computing the certain answers in this context and also study the problem of computing the certain answers of target queries by simply evaluating them on a canonical universal solution.
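
A minimal sketch of the idea, with invented relation names (Emp, Dept) and a single source-to-target tgd; this is our illustration, not the paper's algorithm. The chase step materialises a canonical universal solution with labelled nulls, and the certain answers are the null-free tuples obtained by evaluating a query on it.

    # Source instance and one s-t tgd: Emp(n, d) -> exists m. Dept(d, m).
    # (Relation names and the tgd are hypothetical, for illustration only.)
    from itertools import count

    fresh = count()
    def null():
        return ("NULL", next(fresh))          # labelled null

    emp = {("alice", "cs"), ("bob", "cs"), ("carol", "math")}

    # Chase step: one fresh null per department, witnessing the existential m.
    dept = {(d, null()) for d in {d for (_, d) in emp}}

    def is_null(v):
        return isinstance(v, tuple) and v[0] == "NULL"

    # q1(d) :- Dept(d, m): every answer is null-free, hence certain.
    print({(d,) for (d, m) in dept})                      # {('cs',), ('math',)}

    # q2(m) :- Dept(d, m): every answer contains a null, so nothing is certain.
    print({(m,) for (d, m) in dept if not is_null(m)})    # set()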

916 citations


Additional excerpts

  • ...Algorithm 2...

Journal ArticleDOI
TL;DR: The main result of the paper states that, given a complete set U of view updates, U has a translator if and only if U is translatable under constant complement.
Abstract: A database view is a portion of the data structured in a way suitable to a specific application. Updates on views must be translated into updates on the underlying database. This paper studies the translation process in the relational model. The procedure is as follows: first, a “complete” set of updates is defined such that, together with every update, the set contains a “return” update, that is, one that brings the view back to the original state; given two updates in the set, their composition is also in the set. To translate a complete set, we define a mapping called a “translator” that associates with each view update a unique database update called a “translation.” The constraint on a translation is to take the database to a state mapping onto the updated view. The constraint on the translator is to be a morphism. We propose a method for defining translators. Together with the user-defined view, we define a “complementary” view such that the database can be computed from the view and its complement. We show that a view can have many different complements and that the choice of a complement determines an update policy. Thus, we fix a view complement and define the translation of a given view update in such a way that the complement remains invariant (“translation under constant complement”). The main result of the paper states that, given a complete set U of view updates, U has a translator if and only if U is translatable under constant complement.
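
To make the constant-complement idea concrete, here is a small sketch over a toy two-attribute state (a, b) whose view exposes a only; the names and the brute-force search are ours, purely for illustration. The same view update translates differently under two different choices of complement, which is exactly the sense in which the complement fixes the update policy.

    # Toy state space: pairs (a, b); the view exposes a only.
    def translate(state, new_a, view, complement, states):
        """Constant-complement translation: the unique state s' with
        view(s') == new_a and complement(s') == complement(state)."""
        matches = [s for s in states
                   if view(s) == new_a and complement(s) == complement(state)]
        assert len(matches) == 1, "update not translatable under this complement"
        return matches[0]

    states = [(a, b) for a in range(5) for b in range(5)]
    view = lambda s: s[0]

    # Complement g1(s) = b: updating the view leaves b untouched.
    print(translate((1, 3), 4, view, lambda s: s[1], states))         # (4, 3)

    # Complement g2(s) = a + b: the same view update now changes b too.
    print(translate((1, 3), 4, view, lambda s: s[0] + s[1], states))  # (4, 0)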

621 citations


"The View Update Problem Revisited" refers background in this paper

  • ...With this work we contribute the following: • A general view update framework, based on the notion of determinacy, that constructively revisits [1] in a relational setting with constraints....

  • ...More precisely, given two schemas S and T , a set of embedded dependencies Γ over S ∪ T , and an input CQ q over S, the C&B outputs the CQs over T which are equivalent to q under Γ....

  • ...A general and precise understanding of the view update problem is due to Bancilhon and Spyratos [1], who provide an elegant solution to it within an abstract functional framework....

  • ...Embedded dependencies are expressive enough to capture virtually all other classes of dependencies studied in the literature [5]....

Proceedings ArticleDOI
09 Jun 2008
TL;DR: The standard chase procedure is revisited, and the extended core chase is introduced, which is complete, i.e., finds an F-universal model set when it exists; a key advantage of the new chase is that the same algorithm can be applied to all mapping classes F of interest, simply by modifying the set of constraints given as input.
Abstract: We revisit the standard chase procedure, studying its properties and applicability to classical database problems. We settle (in the negative) the open problem of decidability of termination of the standard chase, and we provide sufficient termination conditions which are strictly less over-conservative than the best previously known. We investigate the adequacy of the standard chase for checking query containment under constraints, constraint implication and computing certain answers in data exchange, gaining a deeper understanding by separating the algorithm from its result. We identify the properties of the chase result that are essential to the above applications, and we introduce the more general notion of F-universal model set, which supports query and constraint languages that are closed under a class F of mappings. By choosing F appropriately, we extend prior results to existential first-order queries and ∀∃-first-order constraints. We show that the standard chase is incomplete for finding universal model sets, and we introduce the extended core chase which is complete, i.e., finds an F-universal model set when it exists. A key advantage of the new chase is that the same algorithm can be applied for all mapping classes F of interest, simply by modifying the set of constraints given as input. Even when restricted to the typical input in prior work, the new chase supports certain answer computation and containment/implication tests in strictly more cases than the incomplete standard chase.
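
For orientation only, a toy implementation of the standard (restricted) chase for tgds; the encoding of atoms and variables is ours, and the paper's extended core chase is considerably more refined. A trigger fires only when the head cannot already be satisfied by extending the matching homomorphism.

    # A toy restricted chase. Atoms are (rel, args); variables start with '?'.
    from itertools import count

    nulls = count()

    def homs(atoms, inst, env=None):
        """Enumerate homomorphisms from a list of atoms into instance inst."""
        env = env or {}
        if not atoms:
            yield dict(env)
            return
        (rel, args), *rest = atoms
        for fact in inst.get(rel, set()):
            e, ok = dict(env), True
            for a, v in zip(args, fact):
                if a.startswith('?'):
                    if e.setdefault(a, v) != v:
                        ok = False
                        break
                elif a != v:
                    ok = False
                    break
            if ok:
                yield from homs(rest, inst, e)

    def chase(inst, tgds, max_rounds=100):
        for _ in range(max_rounds):
            changed = False
            for body, head in tgds:
                for h in list(homs(body, inst)):
                    # Restricted chase: fire only if the head is not yet satisfied.
                    if any(True for _ in homs(head, inst, h)):
                        continue
                    e = dict(h)
                    for rel, args in head:
                        row = []
                        for a in args:
                            if a.startswith('?') and a not in e:
                                e[a] = f"_N{next(nulls)}"   # fresh labelled null
                            row.append(e[a] if a.startswith('?') else a)
                        inst.setdefault(rel, set()).add(tuple(row))
                    changed = True
            if not changed:
                return inst                     # chase terminated
        raise RuntimeError("no termination within the round bound")

    # Emp(n, d) -> exists m. Mgr(d, m)
    print(chase({'Emp': {('alice', 'cs')}},
                [([('Emp', ('?n', '?d'))], [('Mgr', ('?d', '?m'))])]))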

371 citations


"The View Update Problem Revisited" refers background in this paper

  • ...Similarly, we also have that R2(x, y, z) can be rewritten in terms of V as ∃v V2(v, x, y) ∧ V3(x, z)....

Book ChapterDOI
08 Jan 2003
TL;DR: A completeness theorem is proved which guarantees that, under certain conditions, the algorithm will find a minimal reformulation if one exists, and conditions under which the algorithm achieves optimal complexity bounds are identified.
Abstract: We state and solve the query reformulation problem for XML publishing in a general setting that allows mixed (XML and relational) storage for the proprietary data and exploits redundancies (materialized views, indexes and caches) to enhance performance. The correspondence between published and proprietary schemas is specified by views in both directions, and the same algorithm performs rewriting-with-views, composition-with-views, or the combined effect of both, unifying the Global-As-View and Local-As-View approaches to data integration. We prove a completeness theorem which guarantees that under certain conditions, our algorithm will find a minimal reformulation if one exists. Moreover, we identify conditions when this algorithm achieves optimal complexity bounds. We solve the reformulation problem for constraints by exploiting a reduction to the problem of query reformulation.

226 citations


Additional excerpts

  • ...Algorithm 2...

Proceedings ArticleDOI
29 Jun 2009
TL;DR: A notion of generalized schema-mapping is introduced that enriches standard schema-mappings (as defined by Fagin et al.) with more expressive power, and a more general and arguably more intuitive notion of semantics is proposed that relies on three criteria: Soundness, Completeness and Laconicity (non-redundancy and minimal size).
Abstract: Data-Exchange is the problem of creating new databases according to a high-level specification called a schema-mapping while preserving the information encoded in a source database. This paper introduces a notion of generalized schema-mapping that enriches the standard schema-mappings (as defined by Fagin et al.) with more expressive power. It then proposes a more general and arguably more intuitive notion of semantics that relies on three criteria: Soundness, Completeness and Laconicity (non-redundancy and minimal size). These semantics are shown to coincide precisely with the notion of cores of universal solutions in the framework of Fagin, Kolaitis and Popa. They are also well-defined and of interest for larger classes of schema-mappings and more expressive source databases (with null-values and equality constraints). After an investigation of the key properties of generalized schema-mappings and their semantics, a criterion called Termination of the Oblivious Chase (TOC) is identified that ensures polynomial data-complexity. This criterion strictly generalizes the previously known criterion of Weak Acyclicity. To prove the tractability of TOC schema-mappings, a new polynomial-time algorithm is provided that, unlike the algorithm of Gottlob and Nash from which it is inspired, does not rely on the syntactic property of Weak Acyclicity. As the problem of deciding whether a schema-mapping satisfies the TOC criterion is only recursively enumerable, a more restrictive criterion called Super-weak Acyclicity (SwA) is identified that can be decided in polynomial time while substantially generalizing the notion of Weak Acyclicity.
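
As a point of comparison for the acyclicity criteria discussed here, a small sketch of the classical weak-acyclicity test on the position dependency graph (our own encoding, illustrative only, and not the paper's SwA test): regular edges propagate universal variables from body to head positions, special edges point to positions of existential variables, and the tgd set is weakly acyclic iff no special edge lies on a cycle.

    # Positions are (relation, index); variables start with '?'.
    def weakly_acyclic(tgds):
        """tgds: list of (body, head), each a list of atoms (rel, args)."""
        regular, special = set(), set()
        for body, head in tgds:
            head_vars = {a for _, args in head for a in args if a.startswith('?')}
            body_pos = {}
            for rel, args in body:
                for i, a in enumerate(args):
                    if a.startswith('?'):
                        body_pos.setdefault(a, set()).add((rel, i))
            exist_vars = head_vars - set(body_pos)    # existentially quantified
            for x, positions in body_pos.items():
                if x not in head_vars:
                    continue                          # x is not exported
                for p in positions:
                    for rel, args in head:
                        for i, a in enumerate(args):
                            if a == x:
                                regular.add((p, (rel, i)))
                            elif a in exist_vars:
                                special.add((p, (rel, i)))
        edges = regular | special
        def reaches(src, dst):                        # DFS reachability
            seen, stack = {src}, [src]
            while stack:
                u = stack.pop()
                if u == dst:
                    return True
                for a, b in edges:
                    if a == u and b not in seen:
                        seen.add(b)
                        stack.append(b)
            return False
        # Weakly acyclic iff no special edge lies on a cycle.
        return not any(reaches(v, u) for (u, v) in special)

    # E(x, y) -> exists z. E(y, z): special self-loop on position (E, 1).
    print(weakly_acyclic([([('E', ('?x', '?y'))], [('E', ('?y', '?z'))])]))  # False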

211 citations


"The View Update Problem Revisited" refers background in this paper

  • ..., super-weak acyclicity [17], c-stratification [18], safety and inductive restriction [19], some of which extend stratification and some others are incomparable with it....
