Topic

Tuple

About: Tuple is a research topic. Over its lifetime, 6,513 publications have been published within this topic, receiving 146,057 citations. The topic is also known as: tuple & ordered tuplet.


Papers
Proceedings Article
06 Jun 2010
TL;DR: This paper proposes a new paradigm for explaining a why-not question that is based on automatically generating a refined query whose result includes both the original query's result as well as the user-specified missing tuple(s).
Abstract: One useful feature that is missing from today's database systems is an explain capability that enables users to seek clarifications on unexpected query results. There are two types of unexpected query results that are of interest: the presence of unexpected tuples, and the absence of expected tuples (i.e., missing tuples). Clearly, it would be very helpful to users if they could pose follow-up why and why-not questions to seek clarifications on, respectively, unexpected and expected (but missing) tuples in query results. While the why questions can be addressed by applying established data provenance techniques, the problem of explaining the why-not questions has received very little attention. There are currently two explanation models proposed for why-not questions. The first model explains a missing tuple t in terms of modifications to the database such that t appears in the query result with respect to the modified database. The second model explains by identifying the data manipulation operator in the query evaluation plan that is responsible for excluding t from the result. In this paper, we propose a new paradigm for explaining a why-not question that is based on automatically generating a refined query whose result includes both the original query's result as well as the user-specified missing tuple(s). In contrast to the existing explanation models, our approach goes beyond merely identifying the "culprit" query operator responsible for the missing tuple(s) and is useful for applications where it is not appropriate to modify the database to obtain missing tuples.

179 citations
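
As a rough illustration of the query-refinement idea in the abstract above, the following sketch relaxes a single selection threshold just enough for a user-specified missing tuple to enter the result, while every original result tuple is retained. The `Employee` relation, the `min_salary` predicate, and both function names are hypothetical; the paper's actual refinement procedure is not described in this listing.

```python
from typing import NamedTuple

class Employee(NamedTuple):
    name: str
    salary: int

def run_query(rows, min_salary):
    """Selection query: keep rows with salary >= min_salary."""
    return [r for r in rows if r.salary >= min_salary]

def refine_threshold(min_salary, missing):
    """Return a relaxed threshold that admits every missing tuple.

    Lowering a >= threshold can only add rows, so the original result
    is always contained in the refined result.
    """
    needed = min(m.salary for m in missing)
    return min(min_salary, needed)

# Hypothetical data: the user asks why "bob" is not in the result.
rows = [Employee("ann", 90), Employee("bob", 70), Employee("cy", 50)]
original = run_query(rows, 80)
new_threshold = refine_threshold(80, [Employee("bob", 70)])
refined = run_query(rows, new_threshold)
```

Here the explanation returned to the user is the refined query itself (the same selection with threshold 70), not a change to the stored data.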

Proceedings Article
01 Sep 2006
TL;DR: A new type of drop operator is introduced, called a "Window Drop", which logically divides the input stream into windows and probabilistically decides which windows to drop, and always delivers subsets of original query answers with minimal degradation in result quality.
Abstract: Data stream management systems may be subject to higher input rates than their resources can handle. When overloaded, the system must shed load in order to maintain low-latency query results. In this paper, we describe a load shedding technique for queries consisting of one or more aggregate operators with sliding windows. We introduce a new type of drop operator, called a "Window Drop". This operator is aware of the window properties (i.e., window size and window slide) of its downstream aggregate operators in the query plan. Accordingly, it logically divides the input stream into windows and probabilistically decides which windows to drop. This decision is further encoded into tuples by marking the ones that are disallowed from starting new windows. Unlike earlier approaches, our approach preserves integrity of windows throughout a query plan, and always delivers subsets of original query answers with minimal degradation in result quality.

178 citations
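
The window-aware shedding described in the abstract above can be sketched roughly as follows. This is a simplified illustration under several assumptions that do not come from the paper: windows are count-based (a window starts at every index that is a multiple of `slide` and spans `size` tuples), the input is a finite list instead of an unbounded stream, and the names `window_drop` and `windowed_sums` are hypothetical.

```python
import random

def window_drop(stream, size, slide, drop_prob, rng):
    """Return (index, value, may_start) for tuples that survive shedding."""
    # Decide the fate of each window once, keyed by its start index.
    keep = {s: rng.random() >= drop_prob for s in range(0, len(stream), slide)}
    out = []
    for i, value in enumerate(stream):
        # Windows containing index i start at multiples of slide in (i - size, i].
        first = max(0, ((i - size) // slide + 1) * slide)
        if any(keep[s] for s in range(first, i + 1, slide)):
            # A tuple may open a window only if that window was kept.
            may_start = (i % slide == 0) and keep[i]
            out.append((i, value, may_start))
    return out

def windowed_sums(tagged, size):
    """Downstream aggregate: sum each complete window opened by a flagged tuple."""
    by_index = {i: v for i, v, _ in tagged}
    sums = {}
    for i, _, may_start in tagged:
        if may_start and all(j in by_index for j in range(i, i + size)):
            sums[i] = sum(by_index[j] for j in range(i, i + size))
    return sums

stream = list(range(10))
full = windowed_sums(window_drop(stream, 4, 2, 0.0, random.Random(0)), 4)
shed = windowed_sums(window_drop(stream, 4, 2, 0.5, random.Random(1)), 4)
```

Because a tuple is kept whenever any window containing it is kept, every surviving window is intact, so `shed` contains a subset of the entries in `full` with identical values.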

Proceedings Article
01 Jul 1993
TL;DR: This paper proposes a homology detection algorithm based on a probabilistic indexing framework: the sequences of interest are used to generate a highly redundant number of very descriptive tuples, which are subsequently used as indices in a table look-up paradigm.
Abstract: A key issue in managing today's large amounts of genetic data is the availability of efficient, accurate, and selective techniques for detecting homologies (similarities) between newly discovered and already stored sequences. A common characteristic of today's most advanced algorithms, such as FASTA, BLAST, and BLAZE, is the need to scan the contents of the entire database, in order to find one or more matches. This design decision results in either excessively long search times or, as in the case of BLAST, in a sharp trade-off between the achieved accuracy and the required amount of computation. The homology detection algorithm presented in this paper, on the other hand, is based on a probabilistic indexing framework. The algorithm requires minimal access to the database in order to determine matches. This minimal requirement is achieved by using the sequences of interest to generate a highly redundant number of very descriptive tuples; these tuples are subsequently used as indices in a table look-up paradigm. In addition to the description of the algorithm, theoretical and experimental results on the sensitivity and accuracy of the suggested approach are provided. The storage and computational requirements are described and the probability of correct matches and false alarms is derived. Sensitivity and accuracy are shown to be close to those of dynamic programming techniques. A prototype system has been implemented using the described ideas. It contains the full Swiss-Prot database rel 25 (10 MR) and the genome of E. Coli (2 MR). The system is currently being expanded to include the complete Genbank database. (ABSTRACT TRUNCATED AT 250 WORDS)

177 citations
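
The table look-up idea in the abstract above can be illustrated with a minimal sketch. It assumes contiguous length-k tuples and simple vote counting, which is a stand-in chosen for brevity: the listing does not say how the paper forms its tuples or scores matches, and the function names and toy sequences are hypothetical.

```python
from collections import Counter, defaultdict

def tuples_of(seq, k):
    """All contiguous length-k substrings, as tuples of characters."""
    return [tuple(seq[i:i + k]) for i in range(len(seq) - k + 1)]

def build_index(database, k):
    """Map each tuple to the set of stored sequence ids that contain it."""
    index = defaultdict(set)
    for seq_id, seq in database.items():
        for t in tuples_of(seq, k):
            index[t].add(seq_id)
    return index

def lookup(index, query, k):
    """Vote for stored sequences sharing tuples with the query.

    Only index entries are touched; the stored sequences are never scanned.
    """
    votes = Counter()
    for t in set(tuples_of(query, k)):
        for seq_id in index.get(t, ()):
            votes[seq_id] += 1
    return votes.most_common()

database = {"s1": "ACGTACGT", "s2": "TTTTGGGG", "s3": "ACGTTTGG"}
index = build_index(database, 3)
hits = lookup(index, "CGTAC", 3)  # [("s1", 3), ("s3", 1)]
```

The query shares three tuples with `s1` and one with `s3`, so `s1` ranks first without any pass over the stored sequences at query time.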

Proceedings Article
07 Apr 2008
TL;DR: This work introduces the first known polynomial-time algorithms for processing top-k queries in uncertain databases under the generally adopted model of x-relations, whereas the previous best algorithms have exponential complexity in both time and space.
Abstract: This work introduces novel polynomial-time algorithms for processing top-k queries in uncertain databases, under the generally adopted model of x-relations. An x-relation consists of a number of x-tuples, and each x-tuple randomly instantiates into one tuple from one or more alternatives. Our results significantly improve the best known algorithms for top-k query processing in uncertain databases, in terms of both running time and memory usage. Focusing on the single-alternative case, the new algorithms are orders of magnitude faster.

176 citations
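
To make the x-relation model in the abstract above concrete, the sketch below enumerates every possible world of a tiny x-relation and accumulates, for each alternative, the probability that it appears among the k highest-scoring tuples. This brute-force enumeration is exponential and is not the paper's polynomial-time algorithm; it only illustrates the data model. Two assumptions are made here that the listing does not confirm: an x-tuple whose alternative probabilities sum to less than 1 may produce no tuple, and "probability of appearing in the top-k" is used as one possible query semantics.

```python
from itertools import product

def topk_probabilities(x_relation, k):
    """Map (x_tuple_index, alternative_index) -> Pr[alternative is in the top-k].

    x_relation is a list of x-tuples; each x-tuple is a list of
    (score, probability) alternatives.
    """
    choices = []
    for alts in x_relation:
        options = [(j, score, p) for j, (score, p) in enumerate(alts)]
        absent = 1.0 - sum(p for _, p in alts)
        if absent > 1e-12:
            options.append((None, None, absent))  # x-tuple yields no tuple
        choices.append(options)

    result = {}
    for world in product(*choices):  # one choice per x-tuple
        prob = 1.0
        present = []
        for i, (j, score, p) in enumerate(world):
            prob *= p
            if j is not None:
                present.append((score, i, j))
        present.sort(reverse=True)  # highest score first
        for _, i, j in present[:k]:
            result[(i, j)] = result.get((i, j), 0.0) + prob
    return result

# Hypothetical x-relation with three x-tuples.
x_relation = [[(10, 0.5), (3, 0.5)], [(7, 1.0)], [(5, 0.6)]]
top1 = topk_probabilities(x_relation, 1)
top2 = topk_probabilities(x_relation, 2)
```

With four possible worlds this is trivial, but the number of worlds grows as the product of the alternative counts, which is the blow-up that polynomial-time algorithms are designed to avoid.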

Proceedings Article
09 Jul 2016
TL;DR: A decade of progress on building Open IE extractors is described, which results in the latest extractor, OPENIE4, which is computationally efficient, outputs n-ary and nested relations, and also outputs relations mediated by nouns in addition to verbs.
Abstract: Open Information Extraction (Open IE) extracts textual tuples comprising relation phrases and argument phrases from within a sentence, without requiring a pre-specified relation vocabulary. In this paper we first describe a decade of our progress on building Open IE extractors, which results in our latest extractor, OPENIE4, which is computationally efficient, outputs n-ary and nested relations, and also outputs relations mediated by nouns in addition to verbs. We also identify several strengths of the Open IE paradigm, which enable it to be a useful intermediate structure for end tasks. We survey its use in both human-facing applications and downstream NLP tasks, including event schema induction, sentence similarity, text comprehension, learning word vector embeddings, and more.

175 citations
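
One way to picture the textual tuples mentioned in the abstract above is a small record type that allows extra arguments (n-ary relations) and lets an argument be another extraction (nested relations). The `Extraction` class and the example sentences are hypothetical and are not OPENIE4's actual output format.

```python
from dataclasses import dataclass
from typing import Tuple, Union

@dataclass(frozen=True)
class Extraction:
    arg1: "Union[str, Extraction]"
    relation: str
    # One entry for a binary relation, several for an n-ary relation.
    args: "Tuple[Union[str, Extraction], ...]"

    def arity(self) -> int:
        """Total number of arguments, counting arg1."""
        return 1 + len(self.args)

# n-ary: "Ann moved to Boston in 2010."
nary = Extraction("Ann", "moved", ("to Boston", "in 2010"))

# nested: the second argument is itself an extraction.
inner = Extraction("the earth", "is", ("the center of the universe",))
nested = Extraction("Early astronomers", "believed", (inner,))
```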


Network Information
Related Topics (5)
Graph (abstract data type): 69.9K papers, 1.2M citations, 86% related
Time complexity: 36K papers, 879.5K citations, 85% related
Server: 79.5K papers, 1.4M citations, 83% related
Scalability: 50.9K papers, 931.6K citations, 83% related
Polynomial: 52.6K papers, 853.1K citations, 81% related
Performance Metrics
No. of papers in the topic in previous years
Year    Papers
2023    203
2022    459
2021    210
2020    285
2019    306
2018    266