scispace - formally typeset
Topic

Tuple

About: Tuple is a research topic. Over its lifetime, 6,513 publications have been published within this topic, receiving 146,057 citations. The topic is also known as: tuple & ordered tuplet.


Papers
Patent
12 Aug 2002
TL;DR: A query relating to a database is received, and an approximate answer to the query is generated based on at least one join synopsis formed from the database.
Abstract: A method for generating an approximate answer to a query in a database environment in which the database has a plurality of base relations. A query relating to a database is received, and an approximate answer to the query is generated such that the approximate answer is based on at least one join synopsis formed from the database. The method further includes steps of forming a sample-tuple set for at least one selected base relation of a plurality of base relations of a database such that each sample-tuple set contains at least one sample tuple from a corresponding base relation, and forming a join synopsis set for each selected base relation such that each join synopsis set contains a join synopsis for each sample tuple in a sample-tuple set. A join synopsis of a sample tuple is based on a join of the sample tuple and at least one descendent relation of the sample tuple. All join synopsis sets form a statistical summary of the database and are stored.

37 citations
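
The join-synopsis idea in the abstract above can be illustrated with a minimal sketch. It assumes a toy schema (a hypothetical `orders` relation referencing `customers` by foreign key), uniform random sampling, and simple scale-up estimation. None of these names or choices come from the patent; they are assumptions made only for illustration.

```python
import random

# Hypothetical toy relations (not from the patent): orders reference customers.
customers = {1: {"region": "EU"}, 2: {"region": "US"}, 3: {"region": "EU"}}
orders = [{"order_id": i, "cust_id": 1 + i % 3, "amount": 10 * (i + 1)} for i in range(12)]


def build_join_synopsis(base, descendants, key, sample_size, seed=0):
    """Sample tuples from the base relation and join each with its descendant tuple."""
    rng = random.Random(seed)
    sample = rng.sample(base, sample_size)
    # Each synopsis row is the sample tuple extended with the joined attributes.
    return [{**row, **descendants[row[key]]} for row in sample]


def approximate_sum(synopsis, base_size, column, predicate):
    """Estimate SUM(column) over the full join by scaling up the sample result."""
    scale = base_size / len(synopsis)
    return scale * sum(row[column] for row in synopsis if predicate(row))


def is_eu(row):
    return row["region"] == "EU"


synopsis = build_join_synopsis(orders, customers, "cust_id", sample_size=6)
estimate = approximate_sum(synopsis, len(orders), "amount", is_eu)
exact = sum(o["amount"] for o in orders if customers[o["cust_id"]]["region"] == "EU")
```

Because the join is precomputed per sample tuple, a query over the join can be answered from the stored synopsis alone, without touching the base relations at query time. The estimate generally differs from the exact value; it coincides only when the sample covers the whole base relation.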

01 Jan 2006
TL;DR: It is shown that even full dependencies that are not limited to be source-to-target are not closed under composition and that determining whether the composition can be given by these kinds of dependencies is undecidable.
Abstract: We study three fundamental problems in information integration: (1) the data integration query problem, (2) the data exchange core computation problem, and (3) the schema mapping composition problem. The first problem consists of computing the certain answers to a query over a target schema for a source instance under constraints which relate the source and target schemas. We show how to compute certain answers for a larger family of constraints and queries than those previously addressed. One of the main tools is the chase, which we study and extend significantly. The second problem deals with inserting data from one database into another database having a different schema. Fagin, Kolaitis, and Popa have shown that among the universal solutions of a solvable data exchange problem, there exists, up to isomorphism, a most compact one, "the core", and have convincingly argued that this core should be the database to be materialized. We show how to compute the core in the general setting where the mapping between the source and target schemas is given by source-to-target constraints which are arbitrary tuple generating dependencies (TGDs) and target constraints consisting of equality generating dependencies (EGDs) and weakly-acyclic TGDs. The third problem, composition of mappings between schemas, is essential to support schema evolution, data exchange, data integration, and other data management tasks. We study the issues involved in composing schema mappings given by embedded dependencies that need not be source-to-target and we concentrate on obtaining (first-order) embedded dependencies. We provide a composition algorithm and several negative results. In particular, we show that even full dependencies that are not limited to be source-to-target are not closed under composition and that determining whether the composition can be given by these kinds of dependencies is undecidable. These negative results carry over to mappings given by embedded dependencies.

37 citations
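
To make the notion of a tuple generating dependency concrete, here is a minimal sketch of chasing a single source-to-target TGD. The relations `Emp` and `Mgr`, the dependency Emp(n, d) -> exists m. Mgr(d, m), and the labeled-null naming scheme are all hypothetical and chosen for illustration; this is not the algorithm developed in the work above.

```python
from itertools import count


def chase_tgd(emp, mgr):
    """Chase the hypothetical TGD  Emp(n, d) -> exists m. Mgr(d, m).

    Whenever an Emp tuple has no matching Mgr tuple for its department,
    a new Mgr tuple is generated with a fresh labeled null as the manager.
    """
    fresh = count(1)
    result = set(mgr)
    # Sorted iteration only makes the null numbering deterministic.
    for _name, dept in sorted(emp):
        if not any(d == dept for d, _m in result):
            result.add((dept, f"_N{next(fresh)}"))
    return result


emp = {("ann", "sales"), ("bob", "sales"), ("cy", "ops")}
mgr = {("ops", "dana")}
chased = chase_tgd(emp, mgr)
```

A single pass suffices in this sketch because the dependency only adds `Mgr` tuples and its premise mentions only `Emp`, so no new violations can arise. Dependencies whose conclusions feed other premises need repeated application, and termination is then no longer guaranteed in general.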

Patent
17 May 2002
TL;DR: A statistical translation memory (TMEM) is generated by training a translation model with a naturally generated TMEM; a number of tuples may be extracted from each translation pair in the TMEM.
Abstract: A statistical translation memory (TMEM) may be generated by training a translation model with a naturally generated TMEM. A number of tuples may be extracted from each translation pair in the TMEM. The tuples may include a phrase in a source language and a corresponding phrase in a target language. The tuples may also include probability information relating to the phrases generated by the translation model.

37 citations
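
The tuple structure described in this abstract (a source phrase, a target phrase, and probability information) can be sketched as a small record type. The field names, the example entries, and the lookup rule below are assumptions for illustration and are not taken from the patent.

```python
from typing import NamedTuple, Optional


class TmemTuple(NamedTuple):
    """One hypothetical translation-memory entry."""
    source: str   # phrase in the source language
    target: str   # corresponding phrase in the target language
    prob: float   # probability assigned by a translation model


def best_translation(tmem: list[TmemTuple], phrase: str) -> Optional[TmemTuple]:
    """Return the highest-probability entry for a source phrase, if any."""
    candidates = [t for t in tmem if t.source == phrase]
    return max(candidates, key=lambda t: t.prob, default=None)


# Made-up entries, for illustration only.
tmem = [
    TmemTuple("bonjour", "hello", 0.8),
    TmemTuple("bonjour", "good morning", 0.2),
    TmemTuple("merci", "thank you", 0.9),
]
hit = best_translation(tmem, "bonjour")
miss = best_translation(tmem, "au revoir")
```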

08 Feb 2016
TL;DR: A tuple expansion procedure which reconstructs rich information from semantically poor SQL data types such as strings, integers, and floating point numbers is described, and this procedure is used as the foundation of a new user-guided outlier detection framework, dBoost, which relies on inference and statistical modeling of heterogeneous data to flag suspicious fields in database tuples.
Abstract: Rapidly developing areas of information technology are generating massive amounts of data. Human errors, sensor failures, and other unforeseen circumstances unfortunately tend to undermine the quality and consistency of these datasets by introducing outliers – data points that exhibit surprising behavior when compared to the rest of the data. Characterizing, locating, and in some cases eliminating these outliers offers interesting insight about the data under scrutiny and reinforces the confidence that one may have in conclusions drawn from otherwise noisy datasets. In this paper, we describe a tuple expansion procedure which reconstructs rich information from semantically poor SQL data types such as strings, integers, and floating point numbers. We then use this procedure as the foundation of a new user-guided outlier detection framework, dBoost, which relies on inference and statistical modeling of heterogeneous data to flag suspicious fields in database tuples. We show that this novel approach achieves good classification performance, both in traditional numerical datasets and in highly non-numerical contexts such as mostly textual datasets. Our implementation is publicly available, under version 3 of the GNU General Public License.

37 citations
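
A rough sketch of the tuple-expansion idea follows: each raw field is expanded into derived features, and a feature value that is rare across the dataset marks the field as suspicious. The expansion rules, the `min_share` threshold, and the frequency-based flagging are simplifying assumptions; this is not dBoost's actual API or its statistical models.

```python
from collections import Counter
from datetime import datetime, timezone


def expand_field(value):
    """Expand one raw field into derived features (illustrative rules only)."""
    if isinstance(value, str):
        return {"is_title": value.istitle(), "has_digit": any(c.isdigit() for c in value)}
    if isinstance(value, int) and 0 <= value < 2**31:
        # Assumption: integers in this range are read as Unix timestamps (UTC).
        ts = datetime.fromtimestamp(value, tz=timezone.utc)
        return {"hour": ts.hour, "is_weekend": ts.weekday() >= 5}
    return {}


def expand_tuple(row):
    """Expand every field of a tuple; keys are (column index, feature name)."""
    return {(i, name): v for i, field in enumerate(row)
            for name, v in expand_field(field).items()}


def flag_suspicious(rows, min_share=0.2):
    """Return (row index, column index, feature) for rare feature values."""
    expanded = [expand_tuple(r) for r in rows]
    counts = Counter((key, val) for e in expanded for key, val in e.items())
    flags = []
    for idx, e in enumerate(expanded):
        for key, val in e.items():
            if counts[(key, val)] / len(rows) < min_share:
                flags.append((idx, key[0], key[1]))
    return flags


rows = [
    ("Alice", 1454889600), ("Bob", 1454976000), ("Carol", 1455073200),
    ("Dave", 1455148800), ("Erin", 1455235200), ("frank", 1455494400),
]
flags = flag_suspicious(rows)
```

In this toy data the raw integers and strings look unremarkable; only after expansion do the odd hour in the third row and the lowercase name in the last row stand out.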

Proceedings ArticleDOI
01 Dec 2020
TL;DR: This work introduces a span-based joint extraction framework with attention-based semantic representations that outperforms previous systems and achieves state-of-the-art results on ACE2005, CoNLL2004 and ADE.
Abstract: Span-based joint extraction models have shown their efficiency on entity recognition and relation extraction. These models regard text spans as candidate entities and span tuples as candidate relation tuples. Span semantic representations are shared in both entity recognition and relation extraction, but existing models do not capture the semantics of these candidate entities and relations well. To address these problems, we introduce a span-based joint extraction framework with attention-based semantic representations. Specifically, attention is used to calculate semantic representations, including span-specific and contextual ones. We further investigate the effects of four attention variants in generating contextual semantic representations. Experiments show that our model outperforms previous systems and achieves state-of-the-art results on ACE2005, CoNLL2004 and ADE.

37 citations
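
The candidate-generation step described in this abstract (text spans as candidate entities, span tuples as candidate relations) can be sketched as plain enumeration. The width limit and the non-overlap filter are assumptions made here; the attention-based representations and classifiers that score the candidates are not shown.

```python
from itertools import permutations


def enumerate_spans(tokens, max_width=3):
    """Return all (start, end) token spans up to max_width, with end exclusive."""
    n = len(tokens)
    return [(s, e) for s in range(n) for e in range(s + 1, min(s + max_width, n) + 1)]


def enumerate_span_pairs(spans):
    """Return ordered pairs of non-overlapping spans as candidate relation tuples."""
    return [(a, b) for a, b in permutations(spans, 2) if a[1] <= b[0] or b[1] <= a[0]]


tokens = ["Aspirin", "causes", "nausea"]
spans = enumerate_spans(tokens, max_width=2)
pairs = enumerate_span_pairs(spans)
```

The number of candidate pairs grows quadratically in the number of spans, which is one reason span-based systems typically cap the span width.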


Network Information
Related Topics (5)
Graph (abstract data type): 69.9K papers, 1.2M citations, 86% related
Time complexity: 36K papers, 879.5K citations, 85% related
Server: 79.5K papers, 1.4M citations, 83% related
Scalability: 50.9K papers, 931.6K citations, 83% related
Polynomial: 52.6K papers, 853.1K citations, 81% related
Performance Metrics
No. of papers in the topic in previous years
Year  Papers
2023  203
2022  459
2021  210
2020  285
2019  306
2018  266