# Symmetric relations and cardinality-bounded multisets in database systems

31 Aug 2004, pp. 912-923

TL;DR: This work argues that the database system itself should provide native support for cardinality-bounded multisets. It provides techniques, to be implemented by the database engine, that avoid the drawbacks of conventional representations and allow a schema designer to simply declare a table to be symmetric in certain attributes.

Abstract: In a binary symmetric relationship, A is related to B if and only if B is related to A. Symmetric relationships between k participating entities can be represented as multisets of cardinality k. Cardinality-bounded multisets are natural in several real-world applications. Conventional representations in relational databases suffer from several consistency and performance problems. We argue that the database system itself should provide native support for cardinality-bounded multisets. We provide techniques to be implemented by the database engine that avoid the drawbacks, and allow a schema designer to simply declare a table to be symmetric in certain attributes. We describe a compact data structure, and update methods for the structure. We describe an algebraic symmetric closure operator, and show how it can be moved around in a query plan during query optimization in order to improve performance. We describe indexing methods that allow efficient lookups on the symmetric columns. We show how to perform database normalization in the presence of symmetric relations. We provide techniques for inferring that a view is symmetric. We also describe a syntactic SQL extension that allows the succinct formulation of queries over symmetric relations.
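The idea of storing a symmetric relationship as a multiset, and recovering all orderings through a symmetric closure, can be illustrated with a minimal Python sketch. This is not the paper's data structure or operator: it assumes each symmetric tuple is stored once in sorted (canonical) order and that the closure simply enumerates permutations. The names `canonical`, `symmetric_closure`, and the sample `friends` data are hypothetical.

```python
from itertools import permutations


def canonical(values):
    """Return one representative for a multiset of k values (hypothetical helper).

    Sorting makes (a, b) and (b, a) map to the same stored tuple.
    """
    return tuple(sorted(values))


def symmetric_closure(stored):
    """Expand canonically stored tuples into every ordering of their values."""
    closed = set()
    for row in stored:
        # permutations() may repeat a tuple when values repeat; the set removes duplicates.
        closed.update(permutations(row))
    return closed


# Store each relationship once, whichever order it arrives in.
friends = {
    canonical(pair)
    for pair in [("bob", "alice"), ("alice", "bob"), ("carol", "alice")]
}

# A lookup on either column can be answered from the closure.
closure = symmetric_closure(friends)
```

Storing only the canonical tuple avoids the consistency problem of keeping both (A, B) and (B, A) in step, at the cost of computing the closure (or an equivalent rewrite) at query time.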

##### Citations

••

TL;DR: An attempt is made to generalize the concepts of relation, function, composition and equivalence in the multiset context, with a brief survey of the axiomatic approach to multiset theory as a prerequisite.

65 citations

Barry University

TL;DR: In this article, the authors delineate the problem related to difference and complementation in multiset theory and show that none of the existing approaches succeeds in resolving the attendant difficulties without assuming some contrived stipulations.

Abstract: The paper delineates the problem related to difference and complementation in Multiset Theory. It is shown that none of the existing approaches succeeds in resolving the attendant difficulties without assuming some contrived stipulations. Mathematics Subject Classifications: 03E15, 03E20, 03E30

30 citations

01 Jan 2010

TL;DR: It was demonstrated statistically that algorithms proposed for texts’ analogy estimation provide “practically trustworthy” conclusions in the knowledge testing area.

Abstract: The origin for the classification of analogy is the comparison procedure we use to draw a conclusion about the similarity between the source and the target. Whenever the comparison procedure is not clarified, the analogy remains ambiguous. But if this procedure is formalized, the analogy may allow us to formulate and check conditions of its (still partial, but often very high) confidence. Below I discuss some ideas underlying certain algorithms that calculate different kinds of similarity between texts, in particular texts in natural languages. The literal similarity of two words, as well as the lexical similarity of two given texts, can be estimated using the so-called “Theory of Finite Sequences Similarity”. The structural similarity can be measured by fixing certain name groups in the source text and checking the cohesion (proximity) of names belonging to the corresponding groups in the target. A special logical formalism called “The Language of Ternary Description” can provide good templates for structuring the source text when comparing natural-language texts. It was demonstrated statistically that the algorithms proposed for estimating the analogy of texts provide “practically trustworthy” conclusions in the knowledge testing area. I also argue that the high degree of confidence for that type of analogy is connected with the background of the analogy (a notion discussed by G. Polya), though that background need not be evidently formalized.

12 citations

••

23 Sep 2013

TL;DR: In the Zermelo-Fraenkel framework, it is proved that the set of all generalized multisets over a certain finite set is a finitely-generated, lattice-ordered, free abelian group.

Abstract: We present generalized multisets in the Zermelo-Fraenkel framework, in Reverse Mathematics, and in the Fraenkel-Mostowski framework. In the Zermelo-Fraenkel framework, we prove that the set of all generalized multisets over a certain finite set is a finitely-generated, lattice-ordered, free abelian group. Similar properties are then discussed in Reverse Mathematics. Finally, we study the generalized multisets in the Fraenkel-Mostowski framework, and present their nominal properties. Several Zermelo-Fraenkel algebraic properties of generalized multisets are translated into the Fraenkel-Mostowski framework by using the finite support axiom of the Fraenkel-Mostowski set theory.
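The group and lattice structure mentioned in this abstract can be made concrete with a small Python sketch. It assumes, as an illustration rather than as the paper's formal definition, that a generalized multiset over a finite set is a mapping from elements to integer multiplicities (negative values allowed), with absent keys meaning multiplicity zero. The function names `add`, `negate`, `meet`, and `join` are hypothetical.

```python
def _normalize(m):
    """Drop zero multiplicities so equal multisets compare equal as dicts."""
    return {k: v for k, v in m.items() if v != 0}


def add(m, n):
    """Group operation: pointwise sum of multiplicities."""
    return _normalize({k: m.get(k, 0) + n.get(k, 0) for k in set(m) | set(n)})


def negate(m):
    """Group inverse: flip the sign of every multiplicity."""
    return {k: -v for k, v in m.items()}


def meet(m, n):
    """Lattice meet: pointwise minimum of multiplicities."""
    return _normalize({k: min(m.get(k, 0), n.get(k, 0)) for k in set(m) | set(n)})


def join(m, n):
    """Lattice join: pointwise maximum of multiplicities."""
    return _normalize({k: max(m.get(k, 0), n.get(k, 0)) for k in set(m) | set(n)})


m = {"a": 2, "b": -1}
n = {"a": -3, "c": 1}
```

Under this model the singletons `{"a": 1}`, `{"b": 1}`, and so on play the role of free generators, and the pointwise order gives the lattice structure.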

10 citations

### Cites methods from "Symmetric relations and cardinality..."

...Multisets are also used in database theory [17], [30] and in...

[...]

••

TL;DR: The state of the art in the theory of multisets, i.e., mathematical models of sets with repetitions (duplicates or copies of their elements), can be found in this article.

Abstract: This paper reviews the state of the art in the theory of multisets, i.e., mathematical models of sets with repetitions (duplicates or copies of their elements). The corresponding bibliography is categorized as follows: the general theory of multisets, reviews, and application of multisets, in particular, in computer science.

5 citations

##### References

•

01 Jan 1979

TL;DR: This book goes into the details of database conception and use; it covers relational databases from theory to the algorithms actually used.

Abstract: This book goes into the details of database conception and use; it covers relational databases from theory to the algorithms actually used.

2,475 citations

•

01 Jan 1989

1,586 citations

•

03 Sep 1996

TL;DR: A new relational algebra operation is proposed that represents several levels of aggregation over the same groups in an operand relation and a translation from the extended SQL language into the authors' algebraic language is described.

Abstract: Some aggregate and grouping queries are conceptually simple, but difficult to express in SQL. This difficulty causes both conceptual and implementation problems for the SQL-based database system. Complicated queries and views are hard to understand and maintain. Further, the code produced is sometimes unnecessarily inefficient, as we demonstrate experimentally using a commercial database system. In this paper, we examine a class of queries involving (potentially repeated) selection, grouping and aggregation over the same groups, and propose an extension of SQL syntax that allows the succinct representation of these queries. We propose a new relational algebra operation that represents several levels of aggregation over the same groups in an operand relation. We demonstrate that the extended relational operator can be evaluated using efficient algorithms. We describe a translation from the extended SQL language into our algebraic language. We have implemented a preprocessor that evaluates our extended language on top of a commercial database system. We demonstrate that on a variety of examples, our implementation improves performance over standard SQL representations of the same examples by orders of magnitude.

Proceedings of the 22nd VLDB Conference, Mumbai (Bombay), India, 1996. This research was supported by a grant from the AT&T Foundation, by a David and Lucile Packard Foundation Fellowship in Science and Engineering, by a Sloan Foundation Fellowship, by NSF grants IRI-9209029, CDA-90-24735, and by an NSF Young Investigator award.

102 citations

••

09 Jun 2003

TL;DR: It is shown that the inverted file, a powerful index for selection queries, can also facilitate the efficient evaluation of most join predicates. Join algorithms that utilize inverted files are proposed and compared with signature-based methods for several set-comparison predicates.

Abstract: Object-oriented and object-relational DBMS support set valued attributes, which are a natural and concise way to model complex information. However, there has been limited research to-date on the evaluation of query operators that apply on sets. In this paper we study the join of two relations on their set-valued attributes. Various join types are considered, namely the set containment, set equality, and set overlap joins. We show that the inverted file, a powerful index for selection queries, can also facilitate the efficient evaluation of most join predicates. We propose join algorithms that utilize inverted files and compare them with signature-based methods for several set-comparison predicates.
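As a rough illustration of how an inverted file can drive a set containment join, here is a minimal in-memory Python sketch. It is not the algorithm from the referenced paper: it assumes both relations fit in memory as dictionaries from row id to a set-valued attribute, and the names `build_inverted_file` and `containment_join` are hypothetical.

```python
from collections import defaultdict


def build_inverted_file(relation):
    """Map each element to the ids of the rows whose set contains it."""
    index = defaultdict(set)
    for row_id, elements in relation.items():
        for element in elements:
            index[element].add(row_id)
    return index


def containment_join(r, s):
    """Return pairs (r_id, s_id) where r's set is a subset of s's set."""
    index = build_inverted_file(s)
    result = []
    for r_id, r_set in r.items():
        # Start from all rows of s so that an empty r_set matches every row.
        candidates = set(s)
        for element in r_set:
            # Keep only rows of s that also contain this element.
            candidates &= index.get(element, set())
        result.extend((r_id, s_id) for s_id in sorted(candidates))
    return result


r = {"r1": {"a", "b"}, "r2": {"c"}}
s = {"s1": {"a", "b", "c"}, "s2": {"a"}, "s3": {"b", "a"}}
pairs = containment_join(r, s)
```

Intersecting the postings lists of every element in a row of `r` leaves exactly the rows of `s` that contain all of those elements, which is the containment predicate.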

93 citations

••

TL;DR: A restriction under which every domain independent formula is evaluable is described and it is argued that the class of evaluable formulas is the largest decidable subclass of the domain independent formulas that can be efficiently recognized.

Abstract: Not all queries in relational calculus can be answered sensibly when disjunction, negation, and universal quantification are allowed. The class of relational calculus queries or formulas that have sensible answers is called the domain independent class, which is known to be undecidable. Subsequent research has focused on identifying large decidable subclasses of domain independent formulas. In this paper we investigate the properties of two such classes: the evaluable formulas and the allowed formulas. Although both classes have been defined before, we give simplified definitions, present short proofs of their main properties, and describe a method to incorporate equality. Although evaluable queries have sensible answers, it is not straightforward to compute them efficiently or correctly. We introduce relational algebra normal form for formulas, from which form the correct translation into relational algebra is trivial. We give algorithms to transform an evaluable formula into an equivalent allowed formula and from there into relational algebra normal form. Our algorithms avoid use of the so-called Dom relation, consisting of all constants appearing in the database or the query. Finally, we describe a restriction under which every domain independent formula is evaluable and argue that the class of evaluable formulas is the largest decidable subclass of the domain independent formulas that can be efficiently recognized.

93 citations