Rapid sampling for visualizations with ordering guarantees
Albert Kim, Eric Blais, Aditya Parameswaran, Piotr Indyk, Samuel Madden, Ronitt Rubinfeld
- Vol. 8, Iss: 5, pp 521-532
TL;DR: This article addresses the problem of rapidly generating approximate visualizations while preserving visual properties of interest to analysts. It focuses on sampling algorithms that preserve the visual property of ordering, and the techniques also apply to some other visual properties.
Abstract:
Visualizations are frequently used as a means to understand trends and gather insights from datasets, but often take a long time to generate. In this paper, we focus on the problem of rapidly generating approximate visualizations while preserving crucial visual properties of interest to analysts. Our primary focus will be on sampling algorithms that preserve the visual property of ordering; our techniques will also apply to some other visual properties. For instance, our algorithms can be used to generate an approximate visualization of a bar chart very rapidly, where the comparisons between any two bars are correct. We formally show that our sampling algorithms are generally applicable and provably optimal in theory, in that they do not take more samples than necessary to generate the visualizations with ordering guarantees. They also work well in practice, correctly ordering output groups while taking orders of magnitude fewer samples and much less time than conventional sampling schemes.
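To make the idea of an ordering guarantee concrete, here is a minimal Python sketch of one way a sampler could stop as soon as the bars of a chart are provably ordered. It is an illustration under stated assumptions, not the authors' algorithm: the function names, the round-robin schedule, and the union bound over rounds are all hypothetical choices made for this example. It assumes every value lies in an interval of known width `value_range`, draws one sample per unresolved group per round (with replacement), and uses Hoeffding's inequality to build a confidence interval around each group's running mean. A group stops being sampled once its interval overlaps no other group's interval.

```python
import math
import random


def hoeffding_half_width(n, value_range, delta):
    """Half-width eps with P(|sample mean - true mean| >= eps) <= delta,
    for n independent samples bounded in an interval of width value_range."""
    return value_range * math.sqrt(math.log(2.0 / delta) / (2.0 * n))


def intervals_disjoint(a, b):
    """True if the closed intervals a = (lo, hi) and b = (lo, hi) do not overlap."""
    return a[1] < b[0] or b[1] < a[0]


def order_groups_by_sampling(groups, value_range, delta=0.05,
                             max_rounds=100_000, seed=0):
    """Sample each group until its confidence interval is separated from all others.

    groups: dict mapping a group name to a non-empty list of numeric values.
    Returns (estimates, counts): the estimated mean and the number of samples
    drawn for every group. Hypothetical helper, written for illustration only.
    """
    rng = random.Random(seed)
    k = len(groups)
    sums = dict.fromkeys(groups, 0.0)
    counts = dict.fromkeys(groups, 0)
    intervals = {}
    active = list(groups)  # groups whose position in the order is unresolved

    for round_no in range(1, max_rounds + 1):
        if not active:
            break
        # Assumption: split the failure budget over groups and rounds so the
        # intervals hold simultaneously (sum over rounds of 6/(pi^2 m^2) is 1).
        delta_round = 6.0 * delta / (math.pi ** 2 * round_no ** 2 * k)
        eps = hoeffding_half_width(round_no, value_range, delta_round)
        for g in active:
            sums[g] += rng.choice(groups[g])  # one draw, with replacement
            counts[g] += 1
            mean = sums[g] / counts[g]
            intervals[g] = (mean - eps, mean + eps)
        # A group is resolved once its interval overlaps no other interval,
        # including the frozen intervals of groups resolved earlier.
        resolved = {
            g for g in active
            if all(intervals_disjoint(intervals[g], intervals[h])
                   for h in groups if h != g)
        }
        active = [g for g in active if g not in resolved]

    estimates = {g: sums[g] / counts[g] for g in groups}
    return estimates, counts
```

Calling `order_groups_by_sampling(groups, value_range=1.0)` on a dictionary of per-group value lists returns the estimated bar heights and the number of samples spent on each group. Groups whose means are far apart are resolved after few samples, while groups with close means keep being sampled, which is the intuition behind taking fewer samples than a scheme that samples every group equally.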
Citations
Journal Article
SeeDB: efficient data-driven visualization recommendations to support visual analytics
TL;DR: This work proposes SeeDB, a visualization recommendation engine to facilitate fast visual analysis: given a subset of data to be studied, SeeDB intelligently explores the space of visualizations, evaluates promising visualizations for trends, and recommends those it deems most “useful” or “interesting”.
Proceedings Article
Overview of Data Exploration Techniques
TL;DR: This tutorial surveys recent developments in the emerging area of database systems tailored for data exploration as well as new ideas on how to interact with a data system to enable users and applications to quickly figure out which data parts are of interest.
Proceedings Article
Wander Join: Online Aggregation via Random Walks
TL;DR: This paper proposes a new approach, the wander join algorithm, to the online aggregation problem by performing random walks over the underlying join graph, and designs an optimizer that chooses the optimal plan for conducting the random walks without having to collect any statistics a priori.
Proceedings Article
Approximate Query Processing: No Silver Bullet
TL;DR: This paper reflects on the state of the art of Approximate Query Processing, and discusses two promising avenues to pursue towards integrating Approximate Query Processing into data platforms.
Proceedings Article
Sample + Seek: Approximating Aggregates with Distribution Precision Guarantee
TL;DR: A novel sampling scheme called measure-biased sampling is proposed to provide rigorous error guarantees, and two new indexes that augment in-memory samples are proposed to handle arbitrary, highly selective predicates without maintaining large samples.
References
Journal Article
On a Test of Whether one of Two Random Variables is Stochastically Larger than the Other
Henry B. Mann, D. R. Whitney
TL;DR: In this paper, the authors show that the limit distribution of the test statistic is normal if m and n go to infinity in any arbitrary manner, and tabulate its distribution for samples up to n = m = 8.
Book Chapter
Probability Inequalities for sums of Bounded Random Variables
TL;DR: In this article, upper bounds are derived for the probability that the sum S of n independent random variables exceeds its mean ES by a positive number nt; analogous inequalities are obtained for certain sums of dependent random variables such as U statistics.
Book
The Visual Display of Quantitative Information
TL;DR: The visual display of quantitative information is shown in the form of icons and symbols in order to facilitate the interpretation of data.
Journal Article