Topic

Tuple

About: Tuple is a research topic. Over the lifetime, 6513 publications have been published within this topic receiving 146057 citations. The topic is also known as: tuple & ordered tuplet.


Papers
Proceedings Article
27 Aug 1998
TL;DR: It is shown that the lub may not qualify as a reduced version of the given set of tuples, but the interior cover - the subset of internal elements covered by the lub - does qualify, and the theoretical result that such an interior cover exists is established.
Abstract: Data reduction makes datasets smaller but preserves classification structures of interest. In this paper we present a novel approach to data reduction based on lattice and hyper relations. Hyper relations are a generalization of conventional database relations in the sense that we allow sets of values as tuple entries. The advantage of this is that raw data and reduced data can both be represented by hyper relations. The collection of hyper relations can be naturally made into a complete Boolean algebra, and so for any collection of hyper tuples we can find its unique least upper bound (lub) as a reduction of it. We show that the lub may not qualify as a reduced version of the given set of tuples, but the interior cover - the subset of internal elements covered by the lub - does qualify. We establish the theoretical result that such an interior cover exists, and give a method for finding it. The proposed method was evaluated using 7 real world datasets. The results were quite remarkable compared with those obtained by C4.5, and the datasets were reduced with reduction ratios up to 99%.

37 citations
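
The idea of a hyper tuple and its least upper bound can be illustrated with a minimal Python sketch. It assumes, beyond what the abstract states, that hyper tuples are ordered by component-wise set inclusion, so that the lub is the component-wise union. The helper names `to_hyper`, `lub` and `covers` are hypothetical and are not taken from the paper.

```python
from functools import reduce

# A hyper tuple is modelled as a tuple of frozensets, one set per attribute.
# Assumption: the ordering is component-wise set inclusion, so the least
# upper bound of a collection is the component-wise union.

def to_hyper(row):
    """Lift an ordinary tuple to a hyper tuple of singleton sets."""
    return tuple(frozenset([value]) for value in row)

def lub(hyper_tuples):
    """Component-wise union of a non-empty collection of hyper tuples."""
    return reduce(lambda a, b: tuple(x | y for x, y in zip(a, b)), hyper_tuples)

def covers(hyper_tuple, row):
    """True if every value of `row` is in the matching set of `hyper_tuple`."""
    return all(value in values for value, values in zip(row, hyper_tuple))

rows = [("red", "small"), ("red", "large"), ("blue", "small")]
bound = lub(to_hyper(r) for r in rows)
```

Here `bound` is a single hyper tuple that covers all three input rows. It also covers `("blue", "large")`, which is not among the inputs, so a single upper bound can represent more than the data it was built from.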

Journal ArticleDOI
TL;DR: In this approach, test predicates are used to formalize combinatorial testing as a logical problem, and an external formal logic tool is applied to solve it; constraints over the input domain are expressed as logical predicates and handled by the same tool.
Abstract: Combinatorial testing is an effective testing technique to reveal failures in a given system, based on input combinations coverage and combinatorial optimization. Combinatorial testing of strength t (t ≥ 2) requires that each t-wise tuple of values of the different system input parameters is covered by at least one test case. Combinatorial test suite generation algorithms aim at producing a test suite covering all the required tuples in a small (possibly minimal) number of test cases, in order to reduce the cost of testing. The most widely used combinatorial technique is pairwise testing (t = 2), which requires coverage of all pairs of input values. Constrained combinatorial testing also takes into account constraints over the system parameters, for instance forbidden tuples of inputs, modeling invalid or not realizable input value combinations. In this paper a new approach to combinatorial testing, tightly integrated with formal logic, is presented. In this approach, test predicates are used to formalize combinatorial testing as a logical problem, and an external formal logic tool is applied to solve it. Constraints over the input domain are expressed as logical predicates too, and effectively handled by the same tool. Moreover, inclusion or exclusion of select tuples is supported, allowing the user to customize the test suite layout. The proposed approach is supported by a prototype tool implementation and results of experimental assessment are also presented.

37 citations
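
The coverage requirement described in this abstract can be made concrete with a small checker. This is only a sketch of t-wise coverage checking, not the logic-based generation approach of the paper. The function names, the `forbidden` parameter and the example parameters (`os`, `browser`, `net`) are all hypothetical.

```python
from itertools import combinations, product

def required_tuples(domains, t=2):
    """All t-wise tuples: a choice of t parameters plus one value for each."""
    names = sorted(domains)
    required = set()
    for params in combinations(names, t):
        for values in product(*(domains[p] for p in params)):
            required.add(tuple(zip(params, values)))
    return required

def uncovered(domains, suite, t=2, forbidden=()):
    """Required t-wise tuples that no test case in `suite` covers.

    `forbidden` lists tuples excluded from the requirement (an assumed,
    simplified stand-in for constraints over the input domain).
    """
    missing = required_tuples(domains, t) - set(forbidden)
    for test in suite:
        # Keep a tuple only if this test disagrees with it on some parameter.
        missing = {tup for tup in missing if any(test[p] != v for p, v in tup)}
    return missing

domains = {
    "os": ["linux", "mac"],
    "browser": ["firefox", "chrome"],
    "net": ["wifi", "lan"],
}
suite = [
    {"os": "linux", "browser": "firefox", "net": "wifi"},
    {"os": "linux", "browser": "chrome", "net": "lan"},
    {"os": "mac", "browser": "firefox", "net": "lan"},
    {"os": "mac", "browser": "chrome", "net": "wifi"},
]
```

With these inputs, four test cases cover all twelve pairs, whereas exhaustive testing would need eight cases. Dropping the last test case leaves three pairs uncovered.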

Proceedings ArticleDOI
02 Jun 1982
TL;DR: The proposed method appears to be generally advantageous in storage occupancy; in data retrieval operations it is extremely effective when joins between permanent relations are performed and good performances can be achieved with other relational operations using proper parallel architectures and, when temporary relations are involved, using special purpose devices.
Abstract: In this paper a method for relational database storage organization is presented. The method is based upon a disaggregation of the relations and a subsequent reaggregation to form the domains on which the relations are defined. A hierarchical organization of the domain is proposed in order to keep track of the relational entities (i.e. relations, tuples and attributes) that insist on the values present in the domains. Then we introduce an implementation technique, referred to as Data Pool, suitable to be processed by a database machine capable of "on the fly" track processing. Finally we present an analytic evaluation of the DP method and an example of database and query with performance comparison of the DP method with the most common flat file technique. The proposed method appears to be generally advantageous in storage occupancy; in data retrieval operations it is extremely effective when joins between permanent relations are performed. Good performances can be achieved with other relational operations using proper parallel architectures and, when temporary relations are involved, using special purpose devices.

37 citations

Proceedings Article
12 Sep 1994
TL;DR: In this paper, the authors propose a general strategy for the optimization of nested OOSQL queries in the algebraic language ADL, and by means of algebraic rewriting nested queries are transformed into join queries as far as possible.
Abstract: Most declarative SQL-like query languages for object-oriented database systems are orthogonal languages allowing for arbitrary nesting of expressions in the select-, from-, and where-clause. Expressions in the from-clause may be base tables as well as set-valued attributes. In this paper, we propose a general strategy for the optimization of nested OOSQL queries. As in the relational model, the translation/optimization goal is to move from tuple- to set-oriented query processing. Therefore, OOSQL is translated into the algebraic language ADL, and by means of algebraic rewriting nested queries are transformed into join queries as far as possible. Three different optimization options are described, and a strategy to assign priorities to options is proposed.

37 citations
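
The move from tuple-oriented to set-oriented processing mentioned in this abstract can be illustrated in Python. This is a generic sketch of replacing a nested, per-tuple evaluation with a join. It does not use OOSQL or ADL, and the example relations `depts` and `emps` are hypothetical.

```python
from collections import defaultdict

def nested_loop(depts, emps):
    """Tuple-oriented: rescan all employees for every department tuple."""
    return [(d["name"], e["name"])
            for d in depts
            for e in emps
            if e["dept"] == d["id"]]

def hash_join(depts, emps):
    """Set-oriented: group employees by department once, then look up."""
    by_dept = defaultdict(list)
    for e in emps:
        by_dept[e["dept"]].append(e)
    return [(d["name"], e["name"]) for d in depts for e in by_dept[d["id"]]]

depts = [{"id": 1, "name": "R&D"}, {"id": 2, "name": "Sales"}]
emps = [
    {"name": "Ann", "dept": 1},
    {"name": "Bob", "dept": 2},
    {"name": "Cy", "dept": 1},
]
```

Both functions return the same pairs. The first does work proportional to the product of the two relation sizes, while the second scans each relation once and then does only lookups.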

Journal ArticleDOI
TL;DR: This article shows how to use Receiver Operating Characteristic (ROC) curves to estimate the extraction quality in a statistically robust way and how to use ROC analysis to select the extraction parameters in a principled manner, and presents analytic models that reveal how different document retrieval strategies affect the quality of the extracted relation.
Abstract: A large amount of structured information is buried in unstructured text. Information extraction systems can extract structured relations from the documents and enable sophisticated, SQL-like queries over unstructured text. Information extraction systems are not perfect and their output has imperfect precision and recall (i.e., contains spurious tuples and misses good tuples). Typically, an extraction system has a set of parameters that can be used as “knobs” to tune the system to be either precision- or recall-oriented. Furthermore, the choice of documents processed by the extraction system also affects the quality of the extracted relation. So far, estimating the output quality of an information extraction task has been an ad hoc procedure, based mainly on heuristics. In this article, we show how to use Receiver Operating Characteristic (ROC) curves to estimate the extraction quality in a statistically robust way and show how to use ROC analysis to select the extraction parameters in a principled manner. Furthermore, we present analytic models that reveal how different document retrieval strategies affect the quality of the extracted relation. Finally, we present our maximum likelihood approach for estimating, on the fly, the parameters required by our analytic models to predict the runtime and the output quality of each execution plan. Our experimental evaluation demonstrates that our optimization approach predicts accurately the output quality and selects the fastest execution plan that satisfies the output quality restrictions.

37 citations
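
The notions of spurious and missed tuples, and of a tunable "knob", can be sketched as follows. The sketch assumes a known set of correct tuples and a confidence score per extracted tuple, which is a simplification: the article is concerned with estimating quality when such information is not directly available. The function names and the sample data are hypothetical.

```python
def precision_recall(extracted, gold):
    """Precision and recall of a set of extracted tuples against a gold set."""
    extracted, gold = set(extracted), set(gold)
    good = extracted & gold
    # Convention assumed here: an empty denominator yields 1.0.
    precision = len(good) / len(extracted) if extracted else 1.0
    recall = len(good) / len(gold) if gold else 1.0
    return precision, recall

def roc_points(scored, gold):
    """Sweep a confidence threshold over (tuple, score) pairs.

    Returns (threshold, false positive rate, true positive rate) triples,
    with rates computed over the candidate tuples only.
    """
    candidates = {t for t, _ in scored}
    positives = candidates & set(gold)
    negatives = candidates - positives
    points = []
    for threshold in sorted({s for _, s in scored}, reverse=True):
        kept = {t for t, s in scored if s >= threshold}
        tpr = len(kept & positives) / len(positives) if positives else 0.0
        fpr = len(kept & negatives) / len(negatives) if negatives else 0.0
        points.append((threshold, fpr, tpr))
    return points

gold = {("IBM", "Armonk"), ("Apple", "Cupertino"), ("Google", "Mountain View")}
scored = [
    (("IBM", "Armonk"), 0.9),
    (("Apple", "Cupertino"), 0.8),
    (("Apple", "Paris"), 0.6),
    (("Google", "Mountain View"), 0.4),
]
points = roc_points(scored, gold)
```

Lowering the threshold makes the extractor more recall-oriented: it picks up the remaining good tuple only after also admitting the spurious one.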


Network Information
Related Topics (5)
Graph (abstract data type): 69.9K papers, 1.2M citations, 86% related
Time complexity: 36K papers, 879.5K citations, 85% related
Server: 79.5K papers, 1.4M citations, 83% related
Scalability: 50.9K papers, 931.6K citations, 83% related
Polynomial: 52.6K papers, 853.1K citations, 81% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    203
2022    459
2021    210
2020    285
2019    306
2018    266