Open Access Journal Article

Efficient estimation of joint queries from multiple OLAP databases

TLDR
This work proves formally that the partial preaggregation (PP) method yields the same results as the full cross-product (F) method, and provides analytical and experimental results on the accuracy and computational benefits of the PP method.
Abstract
Given an OLAP query expressed over multiple source OLAP databases, we study the problem of estimating the resulting OLAP target database. The problem arises when it is not possible to derive the result from a single database. The method we use is linear indirect estimation, which is commonly used in statistical estimation. We examine two obvious methods for computing such a target database: the full cross-product (F) method and the preaggregation (P) method. We study the accuracy and computational cost of these methods. While the F method provides a more accurate estimate, it is computationally more expensive than the P method. Our contribution is a third, new method, called the partial preaggregation (PP) method, which is significantly less expensive than F but just as accurate. We prove formally that the PP method yields the same results as the F method, and provide analytical and experimental results on the accuracy and computational benefits of the PP method.
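
The abstract does not give the estimator or the algorithms, so the following is only a minimal sketch of how the three methods might relate, under stated assumptions. It assumes a proportional-allocation form of linear indirect estimation, est(x, a, b) = primary(x, a) * proxy(x, b) / proxy(x, ·), over one common dimension x, and it assumes that F estimates at the finest levels and then aggregates, that P aggregates every dimension to the target level before estimating, and that PP preaggregates only the non-common dimensions. All names, roll-up maps, and counts are hypothetical and are not taken from the paper.

```python
from collections import defaultdict

# Hypothetical roll-up maps from the finest level to the target level.
COUNTY_TO_STATE = {"c1": "S", "c2": "S"}                         # common dimension
AGE_TO_GROUP = {"20s": "young", "30s": "young", "60s": "old"}    # primary only
INCOME_TO_BRACKET = {"10k": "low", "20k": "low", "90k": "high"}  # proxy only

# Made-up source cubes: primary[(county, age)] and proxy[(county, income)].
primary = {("c1", "20s"): 10, ("c1", "30s"): 20, ("c1", "60s"): 30,
           ("c2", "20s"): 40, ("c2", "30s"): 5, ("c2", "60s"): 15}
proxy = {("c1", "10k"): 8, ("c1", "20k"): 2, ("c1", "90k"): 10,
         ("c2", "10k"): 1, ("c2", "20k"): 3, ("c2", "90k"): 16}


def rollup(cube, maps):
    """Sum a cube to coarser levels; a None map leaves that dimension as is."""
    out = defaultdict(float)
    for key, value in cube.items():
        new_key = tuple(k if m is None else m[k] for k, m in zip(key, maps))
        out[new_key] += value
    return dict(out)


def estimate(prim, prox):
    """Assumed estimator: prim(x, a) * prox(x, b) / prox(x, .).

    Assumes every proxy total prox(x, .) is nonzero.
    """
    totals = defaultdict(float)
    for (x, _), value in prox.items():
        totals[x] += value
    return {(x, a, b): pv * qv / totals[x]
            for (x, a), pv in prim.items()
            for (y, b), qv in prox.items() if x == y}


def method_f(prim, prox):
    # Full cross-product: estimate at the finest levels, then aggregate.
    fine = estimate(prim, prox)
    return rollup(fine, (COUNTY_TO_STATE, AGE_TO_GROUP, INCOME_TO_BRACKET))


def method_p(prim, prox):
    # Preaggregation: aggregate every dimension first, then estimate.
    return estimate(rollup(prim, (COUNTY_TO_STATE, AGE_TO_GROUP)),
                    rollup(prox, (COUNTY_TO_STATE, INCOME_TO_BRACKET)))


def method_pp(prim, prox):
    # Partial preaggregation: aggregate only the non-common dimensions,
    # estimate with the common dimension still fine, then aggregate it.
    partial = estimate(rollup(prim, (None, AGE_TO_GROUP)),
                       rollup(prox, (None, INCOME_TO_BRACKET)))
    return rollup(partial, (COUNTY_TO_STATE, None, None))
```

Under these assumptions, `method_pp` matches `method_f` because the estimator is linear in the non-common dimensions, while it forms far fewer intermediate cells; `method_p` generally differs because it also sums over the common dimension before dividing by the proxy totals.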

