Topic

Tuple

About: Tuple is a research topic. Over the lifetime, 6513 publications have been published within this topic receiving 146057 citations. The topic is also known as: tuple & ordered tuplet.

...read moreread less

Papers published on a yearly basis

1 / 2

Papers

PDF

Open Access

More filters

Book Chapter•DOI•

Efficient IR-style keyword search over relational databases

[...]

Vagelis Hristidis¹, Luis Gravano², Yannis Papakonstantinou¹•Institutions (2)

University of California, San Diego¹, Columbia University²

09 Sep 2003

TL;DR: This paper adapts IR-style document-relevance ranking strategies to the problem of processing free-form keyword queries over RDBMSs, and develops query-processing strategies that build on a crucial characteristic of IR- style keyword search: only the few most relevant matches are generally of interest.

...read moreread less

Abstract: Applications in which plain text coexists with structured data are pervasive. Commercial relational database management systems (RDBMSs) generally provide querying capabilities for text attributes that incorporate state-of-the-art information retrieval (IR) relevance ranking strategies, but this search functionality requires that queries specify the exact column or columns against which a given list of keywords is to be matched. This requirement can be cumbersome and inflexible from a user perspective: good answers to a keyword query might need to be "assembled" -in perhaps unforeseen ways- by joining tuples from multiple relations. This observation has motivated recent research on free-form keyword search over RDBMSs. In this paper, we adapt IR-style document-relevance ranking strategies to the problem of processing free-form keyword queries over RDBMSs. Our query model can handle queries with both AND and OR semantics, and exploits the sophisticated single-column text-search functionality often available in commercial RDBMSs. We develop query-processing strategies that build on a crucial characteristic of IR-style keyword search: only the few most relevant matches -according to some definition of "relevance"- are generally of interest. Consequently, rather than computing all matches for a keyword query, which leads to inefficient executions, our techniques focus on the top-k matches for the query, for moderate values of k. A thorough experimental evaluation over real data shows the performance advantages of our approach.

...read moreread less

581 citations

Journal Article•DOI•

KLAIM: a kernel language for agents interaction and mobility

[...]

R. De Nicola¹, Gian Luigi Ferrari², Rosario Pugliese•Institutions (2)

University of Florence¹, University of Pisa²

01 May 1998-IEEE Transactions on Software Engineering

TL;DR: This work investigates the issue of designing a kernel programming language for mobile computing and describes KLAIM, a language that supports a programming paradigm where processes, like data, can be moved from one computing environment to another.

...read moreread less

Abstract: We investigate the issue of designing a kernel programming language for mobile computing and describe KLAIM, a language that supports a programming paradigm where processes, like data, can be moved from one computing environment to another. The language consists of a core Linda with multiple tuple spaces and of a set of operators for building processes. KLAIM naturally supports programming with explicit localities. Localities are first-class data (they can be manipulated like any other data), but the language provides coordination mechanisms to control the interaction protocols among located processes. The formal operational semantics is useful for discussing the design of the language and provides guidelines for implementations. KLAIM is equipped with a type system that statically checks access right violations of mobile agents. Types are used to describe the intentions (read, write, execute, etc.) of processes in relation to the various localities. The type system is used to determine the operations that processes want to perform at each locality, and to check whether they comply with the declared intentions and whether they have the necessary rights to perform the intended operations at the specific localities. Via a series of examples, we show that many mobile code programming paradigms can be naturally implemented in our kernel language. We also present a prototype implementation of KLAIM in Java.

...read moreread less

557 citations

Journal Article•DOI•

Approximate Query Processing Using Wavelets

[...]

Kaushik Chakrabarti¹, Minos Garofalakis², Rajeev Rastogi², Kyuseok Shim³•Institutions (3)

University of Illinois at Urbana–Champaign¹, Bell Labs², KAIST³

01 Sep 2001

TL;DR: The use of multi-dimensional wavelets are proposed as an effective tool for general-purpose approximate query processing in modern, high-dimensional applications and a novel wavelet decomposition algorithm is proposed that can build wavelet-coefficient synopses of the data in an I/O-efficient manner.

...read moreread less

Abstract: Approximate query processing has emerged as a cost-effective approach for dealing with the huge data volumes and stringent response-time requirements of today's decision support systems (DSS). Most work in this area, however, has so far been limited in its query processing scope, typically focusing on specific forms of aggregate queries. Furthermore, conventional approaches based on sampling or histograms appear to be inherently limited when it comes to approximating the results of complex queries over high-dimensional DSS data sets. In this paper, we propose the use of multi-dimensional wavelets as an effective tool for general-purpose approximate query processing in modern, high-dimensional applications. Our approach is based on building wavelet-coefficient synopses of the data and using these synopses to provide approximate answers to queries. We develop novel query processing algorithms that operate directly on the wavelet-coefficient synopses of relational tables, allowing us to process arbitrarily complex queries entirely in the wavelet-coefficient domain. This guarantees extremely fast response times since our approximate query execution engine can do the bulk of its processing over compact sets of wavelet coefficients, essentially postponing the expansion into relational tuples until the end-result of the query. We also propose a novel wavelet decomposition algorithm that can build these synopses in an I/O-efficient manner. Finally, we conduct an extensive experimental study with synthetic as well as real-life data sets to determine the effectiveness of our wavelet-based approach compared to sampling and histograms. Our results demonstrate that our techniques: (1) provide approximate answers of better quality than either sampling or histograms; (2) offer query execution-time speedups of more than two orders of magnitude; and (3) guarantee extremely fast synopsis construction times that scale linearly with the size of the data.

...read moreread less

556 citations

Proceedings Article•DOI•

Robust and efficient fuzzy match for online data cleaning

[...]

Surajit Chaudhuri¹, Kris Ganjam¹, Venkatesh Ganti¹, Rajeev Motwani²•Institutions (2)

Microsoft¹, Stanford University²

09 Jun 2003

TL;DR: A new similarity function is proposed which overcomes limitations of commonly used similarity functions, and an efficient fuzzy match algorithm is developed which can effectively clean an incoming tuple if it fails to match exactly with any tuple in the reference relation.

...read moreread less

Abstract: To ensure high data quality, data warehouses must validate and cleanse incoming data tuples from external sources. In many situations, clean tuples must match acceptable tuples in reference tables. For example, product name and description fields in a sales record from a distributor must match the pre-recorded name and description fields in a product reference relation.A significant challenge in such a scenario is to implement an efficient and accurate fuzzy match operation that can effectively clean an incoming tuple if it fails to match exactly with any tuple in the reference relation. In this paper, we propose a new similarity function which overcomes limitations of commonly used similarity functions, and develop an efficient fuzzy match algorithm. We demonstrate the effectiveness of our techniques by evaluating them on real datasets.

...read moreread less

548 citations

Journal Article•DOI•

Open information extraction from the web

[...]

Oren Etzioni¹, Michele Banko¹, Stephen Soderland¹, Daniel S. Weld¹•Institutions (1)

University of Washington¹

01 Dec 2008-Communications of The ACM

TL;DR: In this paper, a self-supervised learner employs a parser and heuristics to determine criteria that will be used by an extraction classifier (or other ranking model) for evaluating the trustworthiness of candidate tuples that have been extracted from the corpus of text.

...read moreread less

Abstract: To implement open information extraction, a new extraction paradigm has been developed in which a system makes a single data-driven pass over a corpus of text, extracting a large set of relational tuples without requiring any human input. Using training data, a Self-Supervised Learner employs a parser and heuristics to determine criteria that will be used by an extraction classifier (or other ranking model) for evaluating the trustworthiness of candidate tuples that have been extracted from the corpus of text, by applying heuristics to the corpus of text. The classifier retains tuples with a sufficiently high probability of being trustworthy. A redundancy-based assessor assigns a probability to each retained tuple to indicate a likelihood that the retained tuple is an actual instance of a relationship between a plurality of objects comprising the retained tuple. The retained tuples comprise an extraction graph that can be queried for information.

...read moreread less

545 citations

Collapse

Network Information

Performance

Metrics

7,188

Papers

157,520

Citations

No. of papers in the topic in previous years
Year	Papers
2023	203
2022	459
2021	210
2020	285
2019	306
2018	266

Tuple

Papers published on a yearly basis

Papers

Trending Questions (10)

Network Information

Related Topics (5)

Performance

Metrics