scispace - formally typeset
Search or ask a question
Topic

Tuple

About: Tuple is a research topic. Over the lifetime, 6513 publications have been published within this topic receiving 146057 citations. The topic is also known as: tuple & ordered tuplet.


Papers
More filters
Journal ArticleDOI
01 Jul 2012
TL;DR: It is proved that functional dependencies are subsumed by order dependencies and that the set of axioms for order dependencies is sound and complete.
Abstract: Dependencies have played a significant role in database design for many years. They have also been shown to be useful in query optimization. In this paper, we discuss dependencies between lexicographically ordered sets of tuples. We introduce formally the concept of order dependency and present a set of axioms (inference rules) for them. We show how query rewrites based on these axioms can be used for query optimization. We present several interesting theorems that can be derived using the inference rules. We prove that functional dependencies are subsumed by order dependencies and that our set of axioms for order dependencies is sound and complete.

42 citations

Proceedings ArticleDOI
11 Apr 2011
TL;DR: This paper proposes a novel Semi-Streaming Index Join (SSIJ) algorithm that maximizes the throughput of the join by buffering stream tuples and then judiciously selecting how to best amortize expensive disk seeks for blocks of the stored relation among a large number of stream Tuples.
Abstract: Active data warehouses have emerged as a new business intelligence paradigm where data in the integrated repository is refreshed in near real-time. This shift of practices achieves higher consistency between the stored information and the latest updates, which in turn influences crucially the output of decision making processes. In this paper we focus on the changes required in the implementation of Extract Transform Load (ETL) operations which now need to be executed in an online fashion. In particular, the ETL transformations frequently include the join between an incoming stream of updates and a disk-resident table of historical data or metadata. In this context we propose a novel Semi-Streaming Index Join (SSIJ) algorithm that maximizes the throughput of the join by buffering stream tuples and then judiciously selecting how to best amortize expensive disk seeks for blocks of the stored relation among a large number of stream tuples. The relation blocks required for joining with the stream are loaded from disk based on an optimal plan. In order to maximize the utilization of the available memory space for performing the join, our technique incorporates a simple but effective cache replacement policy for managing the retrieved blocks of the relation. Moreover, SSIJ is able to adapt to changing characteristics of the stream (i.e. arrival rate, data distribution) by dynamically adjusting the allocated memory between the cached relation blocks and the stream. Our experiments with a variety of synthetic and real data sets demonstrate that SSIJ consistently outperforms the state-of-the-art algorithm in terms of the maximum sustainable throughput of the join while being also able to accommodate deadlines on stream tuple processing.

41 citations

Proceedings ArticleDOI
09 May 2017
TL;DR: This paper thoroughly investigates the satisfiability and learning problems in a variety of settings, and derives insight on how the different facets of the problem interplay with the size of the database, thereby providing the theoretical foundations necessary for a future implementation of query learning from examples.
Abstract: This paper investigates the problem of reverse engineering, i.e., learning, select-project-join (SPJ) queries from a user-provided example set, containing positive and negative tuples. The goal is then to determine whether there exists a query returning all the positive tuples, but none of the negative tuples, and furthermore, to find such a query, if it exists. These are called the satisfiability and learning problems, respectively. The ability to solve these problems is an important step in simplifying the querying process for non-expert users.This paper thoroughly investigates the satisfiability and learning problems in a variety of settings. In particular, we consider several classes of queries, which allow different combinations of the operators select, project and join. In addition, we compare the complexity of satisfiability and learning, when the query is, or is not, of bounded size. We note that bounded-size queries are of particular interest, as they can be used to avoid over-fitting (i.e., tailoring a query precisely to only the seen examples).In order to fully understand the underlying factors which make satisfiability and learning (in)tractable, we consider different components of the problem, namely, the size of a query to be learned, the size of the schema and the number of examples. We study the complexity of our problems, when considering these as part of the input, as constants or as parameters (i.e., as in parameterized complexity analysis). Depending on the setting, the complexity of satisfiability and learning can vary significantly. Among other results, our analysis also provides new problems that are complete for W[3], for which few natural problems are known. Finally, by considering a variety of settings, we derive insight on how the different facets of our problem interplay with the size of the database, thereby providing the theoretical foundations necessary for a future implementation of query learning from examples.

41 citations

Book ChapterDOI
15 Jun 1987
TL;DR: In this article, two semantic models for data flow nets are given: the first model describes the semantics of a data flow net as a function from (tens of) sequences of tokens to sets of (tuples of) sequence of tokens.
Abstract: Two semantic models for data flow nets are given. The first model is an intuitive, operational model. This model has an important drawback: it is not compositional. An example given in [Brock & Ackerman 1981] shows the non-compositionality of our model. There exist two nets that have the same semantics, but when they are placed in a specific context, the semantics of the resulting nets differ. The second one is obtained by adding information to the first model. The amount of information is enough to make it compositional. Moreover, we show that we have added the minimal amount of information to make the model compositional: the second model is fully abstract with respect to the equivalence generated by the first model. To be more specific: the first model describes the semantics a data flow net as a function from (tuples of) sequences of tokens to sets of (tuples of) sequences of tokens. The second one maps a data flow net to a function from (tuples of) infinite sequences of finite words tO sets of (tuples of) infinite sequences of finite words.

41 citations

Proceedings ArticleDOI
06 Nov 2006
TL;DR: A novel approach to rank the query results of an E-commerce Web database based on how much the user cares about each attribute is proposed, which can effectively capture a user's preferences.
Abstract: To deal with the problem of too many results returned from an E-commerce Web database in response to a user query, this paper proposes a novel approach to rank the query results. Based on the user query, we speculate how much the user cares about each attribute and assign a corresponding weight to it. Then, for each tuple in the query result, each attribute value is assigned a score according to its "desirableness" to the user. These attribute value scores are combined according to the attribute weights to get a final ranking score for each tuple. Tuples with the top ranking scores are presented to the user first. Our ranking method is domain independent and requires no user feedback. Experimental results demonstrate that this ranking method can effectively capture a user's preferences.

41 citations


Network Information
Related Topics (5)
Graph (abstract data type)
69.9K papers, 1.2M citations
86% related
Time complexity
36K papers, 879.5K citations
85% related
Server
79.5K papers, 1.4M citations
83% related
Scalability
50.9K papers, 931.6K citations
83% related
Polynomial
52.6K papers, 853.1K citations
81% related
Performance
Metrics
No. of papers in the topic in previous years
YearPapers
2023203
2022459
2021210
2020285
2019306
2018266