Topic

Tuple

About: Tuple is a research topic. Over the lifetime, 6513 publications have been published within this topic receiving 146057 citations. The topic is also known as: tuple & ordered tuplet.

...read moreread less

Papers published on a yearly basis

1 / 2

Papers

PDF

Open Access

More filters

Proceedings Article•

Semantic Data Caching and Replacement

[...]

Shaul Dar, Michael J. Franklin, Björn Þór Jónsson, Divesh Srivastava, Michael Tan - Show less +1 more

03 Sep 1996

TL;DR: A semantic model for client-side caching and replacement in a client-server database system and compared to page caching and tuple caching strategies is proposed and validated with a detailed performance study.

...read moreread less

Abstract: We propose a semantic model for client-side caching and replacement in a client-server database system and compare this approach to page caching and tuple caching strategies. Our caching model is based on, and derives its advantages from, three key ideas. First, the client maintains a semantic description of the data in its cache,which allows for a compact specification, as a remainder query, of the tuples needed to answer a query that are not available in the cache. Second, usage information for replacement policies is maintained in an adaptive fashion for semantic regions, which are associated with collections of tuples. This avoids the high overheads of tuple caching and, unlike page caching, is insensitive to bad clustering. Third, maintaining a semantic description of cached data enables the use of sophisticated value functions that incorporate semantic notions of locality, not just LRU or MRU, for cache replacement. We validate these ideas with a detailed performance study that includes traditional workloads as well as a workload motivated by a mobile navigation application.

...read moreread less

610 citations

Proceedings Article•

Relation Extraction with Matrix Factorization and Universal Schemas

[...]

Sebastian Riedel¹, Limin Yao², Andrew McCallum², Benjamin M. Marlin²•Institutions (2)

University College London¹, University of Massachusetts Amherst²

01 Jan 2013

TL;DR: In this article, a matrix factorization model is used to learn latent feature vectors for entity tuples and relations in a universal schema, which has an almost unlimited set of relations (due to surface forms).

...read moreread less

Abstract: © 2013 Association for Computational Linguistics. Traditional relation extraction predicts relations within some fixed and finite target schema. Machine learning approaches to this task require either manual annotation or, in the case of distant supervision, existing structured sources of the same schema. The need for existing datasets can be avoided by using a universal schema: the union of all involved schemas (surface form predicates as in OpenIE, and relations in the schemas of preexisting databases). This schema has an almost unlimited set of relations (due to surface forms), and supports integration with existing structured data (through the relation types of existing databases). To populate a database of such schema we present matrix factorization models that learn latent feature vectors for entity tuples and relations. We show that such latent models achieve substantially higher accuracy than a traditional classification approach. More importantly, by operating simultaneously on relations observed in text and in pre-existing structured DBs such as Freebase, we are able to reason about unstructured and structured data in mutually-supporting ways. By doing so our approach outperforms stateof- the-Art distant supervision.

...read moreread less

609 citations

Proceedings Article•DOI•

Packet classification using tuple space search

[...]

V. Srinivasan¹, Subhash Suri¹, George Varghese¹•Institutions (1)

Washington University in St. Louis¹

30 Aug 1999

TL;DR: The Pruned Tuple Space search is the only scheme known to us that allows fast updates and fast search times, and an optimal algorithm is described, called Rectangle Search, for two-dimensional filters.

...read moreread less

Abstract: Routers must perform packet classification at high speeds to efficiently implement functions such as firewalls and QoS routing. Packet classification requires matching each packet against a database of filters (or rules), and forwarding the packet according to the highest priority filter. Existing filter schemes with fast lookup time do not scale to large filter databases. Other more scalable schemes work for 2-dimensional filters, but their lookup times degrade quickly with each additional dimension. While there exist good hardware solutions, our new schemes are geared towards software implementation.We introduce a generic packet classification algorithm, called Tuple Space Search (TSS). Because real databases typically use only a small number of distinct field lengths, by mapping filters to tuples even a simple linear search of the tuple space can provide significant speedup over naive linear search over the filters. Each tuple is maintained as a hash table that can be searched in one memory access. We then introduce techniques for further refining the search of the tuple space, and demonstrate their effectiveness on some firewall databases. For example, a real database of 278 filters had a tuple space of 41 which our algorithm prunes to 11 tuples. Even as we increased the filter database size from 1K to 100K (using a random two-dimensional filter generation model), the number of tuples grew from 53 to only 186, and the pruned tuples only grew from 1 to 4. Our Pruned Tuple Space search is also the only scheme known to us that allows fast updates and fast search times. We also show a lower bound on the general tuple space search problem, and describe an optimal algorithm, called Rectangle Search, for two-dimensional filters.

...read moreread less

604 citations

Journal Article•DOI•

Schism: a workload-driven approach to database replication and partitioning

[...]

Carlo Curino¹, Evan P. C. Jones¹, Yang Zhang¹, Samuel Madden¹•Institutions (1)

Massachusetts Institute of Technology¹

01 Sep 2010

TL;DR: Schism consistently outperforms simple partitioning schemes, and in some cases proves superior to the best known manual partitioning, reducing the cost of distributed transactions up to 30%.

...read moreread less

Abstract: We present Schism, a novel workload-aware approach for database partitioning and replication designed to improve scalability of shared-nothing distributed databases. Because distributed transactions are expensive in OLTP settings (a fact we demonstrate through a series of experiments), our partitioner attempts to minimize the number of distributed transactions, while producing balanced partitions. Schism consists of two phases: i) a workload-driven, graph-based replication/partitioning phase and ii) an explanation and validation phase. The first phase creates a graph with a node per tuple (or group of tuples) and edges between nodes accessed by the same transaction, and then uses a graph partitioner to split the graph into k balanced partitions that minimize the number of cross-partition transactions. The second phase exploits machine learning techniques to find a predicate-based explanation of the partitioning strategy (i.e., a set of range predicates that represent the same replication/partitioning scheme produced by the partitioner).The strengths of Schism are: i) independence from the schema layout, ii) effectiveness on n-to-n relations, typical in social network databases, iii) a unified and fine-grained approach to replication and partitioning. We implemented and tested a prototype of Schism on a wide spectrum of test cases, ranging from classical OLTP workloads (e.g., TPC-C and TPC-E), to more complex scenarios derived from social network websites (e.g., Epinions.com), whose schema contains multiple n-to-n relationships, which are known to be hard to partition. Schism consistently outperforms simple partitioning schemes, and in some cases proves superior to the best known manual partitioning, reducing the cost of distributed transactions up to 30%.

...read moreread less

602 citations

Journal Article•DOI•

Tane: An Efficient Algorithm for Discovering Functional and Approximate Dependencies

[...]

Yka Huhtala, Juha Kärkkäinen, Pasi Porkka, Hannu Toivonen

01 Jan 1999-The Computer Journal

TL;DR: TANE is an efficient algorithm for finding functional dependencies from large databases based on partitioning the set of rows with respect to their attribute values, which makes testing the validity of functional dependencies fast even for a large number of tuples.

...read moreread less

Abstract: The discovery of functional dependencies from relations is an important database analysis technique. We present TANE, an efficient algorithm for finding functional dependencies from large databases. TANE is based on partitioning the set of rows with respect to their attribute values, which makes testing the validity of functional dependencies fast even for a large number of tuples. The use of partitions also makes the discovery of approximate functional dependencies easy and efficient and the erroneous or exceptional rows can be identified easily. Experiments show that T ANE is fast in practice. For benchmark databases the running times are improved by several orders of magnitude over previously published results. The algorithm is also applicable to much larger datasets than the previous methods.

...read moreread less

602 citations

Collapse

Network Information

Performance

Metrics

7,188

Papers

157,520

Citations

No. of papers in the topic in previous years
Year	Papers
2023	203
2022	459
2021	210
2020	285
2019	306
2018	266

Tuple

Papers published on a yearly basis

Papers

Trending Questions (10)

Network Information

Related Topics (5)

Performance

Metrics