ε-nets and simplex range queries

doi:10.1007/BF02187876

Journal Article•DOI•

ε-nets and simplex range queries

David Haussler¹, Emo Welzl²•Institutions (2)

University of California, Santa Cruz¹, University of Graz²

01 Dec 1987-Discrete and Computational Geometry (Springer New York)-Vol. 2, Iss: 1, pp 127-151

TL;DR: The concept of an ε-net of a set of points for an abstract set of ranges is introduced, and sufficient conditions that a random sample is an ε-net with any desired probability are given.

Abstract: We demonstrate the existence of data structures for half-space and simplex range queries on finite point sets in d-dimensional space, d ≥ 2, with linear storage and O(n^α) query time, $$\alpha = \frac{d(d - 1)}{d(d - 1) + 1} + \gamma \quad \text{for all } \gamma > 0.$$ These bounds are better than those previously published for all d ≥ 2. Based on ideas due to Vapnik and Chervonenkis, we introduce the concept of an ε-net of a set of points for an abstract set of ranges and give sufficient conditions that a random sample is an ε-net with any desired probability. Using these results, we demonstrate how random samples can be used to build a partition-tree structure that achieves the above query time.
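
To make the two central quantities of the abstract concrete, the following Python sketch evaluates the query-time exponent α for a given d and checks the ε-net property in the simplest possible setting. It is an illustration under stated assumptions, not the paper's construction: the function names are hypothetical, the ε-net definition used (every range containing more than ε|A| points of A must contain a point of the net) is the standard one, and intervals on the real line stand in for the half-spaces and simplices treated in the paper.

```python
from fractions import Fraction


def query_exponent(d, gamma=Fraction(0)):
    """Exponent alpha = d(d-1) / (d(d-1) + 1) + gamma from the abstract (d >= 2)."""
    if d < 2:
        raise ValueError("the bound is stated for d >= 2")
    k = d * (d - 1)
    return Fraction(k, k + 1) + gamma


def is_eps_net_for_intervals(points, net, eps):
    """Check the eps-net property for ranges that are intervals on the line.

    Assumes distinct points. An interval restricted to `points` is a run of
    consecutive points in sorted order, so `net` is an eps-net exactly when
    no run of consecutive non-net points is longer than eps * len(points).
    """
    net_set = set(net)
    longest = run = 0
    for p in sorted(points):
        if p in net_set:
            run = 0          # a net point ends the current uncovered run
        else:
            run += 1
            longest = max(longest, run)
    return longest <= eps * len(points)


if __name__ == "__main__":
    print(query_exponent(2), query_exponent(3))
    print(is_eps_net_for_intervals(range(10), [2, 5, 8], 0.2))
```

With γ set to 0 the function returns the limiting value of the exponent; the paper's bound holds for every γ > 0, so the actual exponent is strictly larger than this value.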

Citations

Journal Article•DOI•

On the VC-dimension of half-spaces with respect to convex sets

Nicolas Grelier, Saeed Gh. Ilchi, Tillmann Miltzow, Shakhar Smorodinsky

19 Aug 2021-Discrete Mathematics & Theoretical Computer Science

TL;DR: A quadratic lower bound in the number of pairs of intersecting sets in a shattered family of convex sets in the plane is provided, and it is shown that the VC-dimension is unbounded for pairwise disjoint convex sets in R^d, for d > 2.

Abstract: A family S of convex sets in the plane defines a hypergraph H = (S, E) as follows. Every subfamily S' of S defines a hyperedge of H if and only if there exists a halfspace h that fully contains S', and no other set of S is fully contained in h. In this case, we say that h realizes S'. We say a set S is shattered if all its subsets are realized. The VC-dimension of a hypergraph H is the size of the largest shattered set. We show that the VC-dimension for pairwise disjoint convex sets in the plane is bounded by 3, and this is tight. In contrast, we show the VC-dimension of convex sets in the plane (not necessarily disjoint) is unbounded. We provide a quadratic lower bound in the number of pairs of intersecting sets in a shattered family of convex sets in the plane. We also show that the VC-dimension is unbounded for pairwise disjoint convex sets in R^d, for d > 2. We focus on, possibly intersecting, segments in the plane and determine that the VC-dimension is always at most 5. And this is tight, as we construct a set of five segments that can be shattered. We give two exemplary applications, one for a geometric set cover problem and one for a range-query data structure problem, to motivate our findings.
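
The definitions of shattering and VC-dimension in this abstract can be checked by brute force on small, explicitly listed hypergraphs. The Python sketch below does this for a finite ground set and a finite list of hyperedges. The function names are hypothetical, and the example uses points and intervals on a line rather than convex sets and halfspaces, purely to keep the input small.

```python
from itertools import combinations


def is_shattered(subset, hyperedges):
    """True if every subset of `subset` arises as (hyperedge ∩ subset)."""
    subset = frozenset(subset)
    traces = {frozenset(e) & subset for e in hyperedges}
    return len(traces) == 2 ** len(subset)


def vc_dimension(ground, hyperedges):
    """Size of the largest shattered subset of `ground` (exponential time)."""
    ground = list(ground)
    best = 0
    for k in range(1, len(ground) + 1):
        if any(is_shattered(c, hyperedges) for c in combinations(ground, k)):
            best = k
        else:
            # Every subset of a shattered set is shattered, so if no set of
            # size k is shattered, no larger set can be shattered either.
            break
    return best


if __name__ == "__main__":
    # Hyperedges: the subsets of {1, 2, 3} cut out by intervals on the line.
    intervals = [set(), {1}, {2}, {3}, {1, 2}, {2, 3}, {1, 2, 3}]
    print(vc_dimension({1, 2, 3}, intervals))
```

In the interval example no interval contains 1 and 3 without 2, so the full triple is not shattered while every pair is.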

1 citation

Posted Content•

RRR: Rank-Regret Representative

Abolfazl Asudeh¹, Azade Nazi², Nan Zhang³, Gautam Das⁴, H. V. Jagadish¹•Institutions (4)

University of Michigan¹, Google², Pennsylvania State University³, University of Texas at Arlington⁴

28 Feb 2018-arXiv: Databases

TL;DR: In this paper, the authors define regret by the position of items in the ranked list, rather than by the loss in score, for any chosen ranking function, and propose the rank-regret representative as the minimal subset of the data containing at least one of the top-k of any possible ranking function.

Abstract: Selecting the best items in a dataset is a common task in data exploration. However, the concept of "best" lies in the eyes of the beholder: different users may consider different attributes more important, and hence arrive at different rankings. Nevertheless, one can remove "dominated" items and create a "representative" subset of the data set, comprising the "best items" in it. A Pareto-optimal representative is guaranteed to contain the best item of each possible ranking, but it can be almost as big as the full data. A much smaller representative can be found if we relax the requirement to include the best item for every possible user, and instead just limit the users' "regret". Existing work defines regret as the loss in score by limiting consideration to the representative instead of the full data set, for any chosen ranking function. However, the score is often not a meaningful number and users may not understand its absolute value. Sometimes small ranges in score can include large fractions of the data set. In contrast, users do understand the notion of rank ordering. Therefore, alternatively, we consider the position of the items in the ranked list for defining the regret and propose the rank-regret representative as the minimal subset of the data containing at least one of the top-k of any possible ranking function. This problem is NP-complete. We use the geometric interpretation of items to bound their ranks on ranges of functions and to utilize combinatorial geometry notions for developing effective and efficient approximation algorithms for the problem. Experiments on real datasets demonstrate that we can efficiently find small subsets with small rank-regrets.
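
The notion of rank-regret described in this abstract can be illustrated with a small checker. The Python sketch below measures, for a candidate subset, the worst rank of its best-placed member over a set of ranking functions. It rests on assumptions that are not in the abstract: ranking functions are taken to be linear with the weight vectors supplied by the caller, only that finite list of functions is examined (not "any possible ranking function"), and the name `rank_regret` is hypothetical. It is a checker for a given subset, not one of the paper's algorithms for finding a minimal representative.

```python
def rank_regret(items, subset, weight_vectors):
    """Worst-case rank of the best subset member over the given rankings.

    items: list of equal-length numeric tuples (the full data set).
    subset: non-empty list of tuples drawn from `items`.
    weight_vectors: linear ranking functions, one weight per attribute.
    Rank 1 is the top item; ties keep the input order of `items`.
    """
    worst = 0
    for w in weight_vectors:
        def score(x, w=w):
            return sum(wi * xi for wi, xi in zip(w, x))

        ranked = sorted(items, key=score, reverse=True)
        best_rank = min(ranked.index(s) + 1 for s in subset)
        worst = max(worst, best_rank)
    return worst


if __name__ == "__main__":
    items = [(1, 0), (0, 1), (0.6, 0.6), (0.2, 0.2)]
    weights = [(1, 0), (0, 1), (1, 1)]
    print(rank_regret(items, [(0.6, 0.6)], weights))
```

A subset is a valid representative for a given k, with respect to the supplied functions, when the returned value is at most k.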

1 citation

Journal Article•DOI•

Efficient data dissemination using locale covers

Sandeep Gupta¹, Jinfeng Ni¹, Chinya V. Ravishankar¹•Institutions (1)

University of California, Riverside¹

01 Apr 2008-Pervasive and Mobile Computing

TL;DR: It is shown that location-dependent queries may be answered satisfactorily using locale covers, and two important results are proved: one regarding the greedy algorithm for sensor covers and the other pertaining to randomized locale covers for k-nearest neighbor queries.

1 citation

Journal Article•

QPTAS for Weighted Geometric Set Cover on Pseudodisks and Halfspaces

Nabil H. Mustafa, Rajiv Raman¹, Saurabh Ray²•Institutions (2)

Indian Institute of Technology Delhi¹, New York University²

01 Jan 2015-SIAM Journal on Computing

TL;DR: This paper presents a QPTAS for weighted geometric set-cover problems on pseudodisks in the plane and halfspaces in R^3, based on the separator framework of Adamaszek and Wiese; this rules out the possibility that these problems are APX-hard, assuming NP ⊈ DTIME(2^polylog(n)).

Abstract: Weighted geometric set-cover problems arise naturally in several geometric and non-geometric settings (e.g., the breakthrough of Bansal and Pruhs (FOCS 2010) reduces a wide class of machine scheduling problems to weighted geometric set-cover). More than two decades of research has succeeded in settling the (1 + ε)-approximability status for most geometric set-cover problems, except for some basic scenarios which are still lacking. One is that of weighted disks in the plane for which, after a series of papers, Varadarajan (STOC 2010) presented a clever quasi-sampling technique, which together with improvements by Chan et al. (SODA 2012), yielded an O(1)-approximation algorithm. Even for the unweighted case, a PTAS for a fundamental class of objects called pseudodisks (which includes half-spaces, disks, unit-height rectangles, translates of convex sets, etc.) is currently unknown. Another fundamental case is weighted halfspaces in R^3, for which a PTAS is currently lacking. In this paper, we present a QPTAS for all of these remaining problems. Our results are based on the separator framework of Adamaszek and Wiese (FOCS 2013, SODA 2014), who recently obtained a QPTAS for weighted independent set of polygonal regions. This rules out the possibility that these problems are APX-hard, assuming NP ⊈ DTIME(2^polylog(n)). Together with the recent work of Chan and Grant (CGTA 2014), this settles the APX-hardness status for all natural geometric set-cover problems.

1 citation

Journal Article•DOI•

On Finding Rank Regret Representatives

Abolfazl Asudeh, Gautam Das, H. V. Jagadish, Shangqi Lu, Azade Nazi, Yufei Tao, Ning Zhang, Jianwen Zhao

18 Aug 2022-ACM Transactions on Database Systems

TL;DR: The rank-regret representative is proposed as the minimal subset of the data containing at least one of the top-k of any possible ranking function; finding it is polynomial-time solvable in two-dimensional space but NP-hard in three or more dimensions.

Abstract: Selecting the best items in a dataset is a common task in data exploration. However, the concept of “best” lies in the eyes of the beholder: Different users may consider different attributes more important and, hence, arrive at different rankings. Nevertheless, one can remove “dominated” items and create a “representative” subset of the data, comprising the “best items” in it. A Pareto-optimal representative is guaranteed to contain the best item of each possible ranking, but it can be a large portion of data. A much smaller representative can be found if we relax the requirement of including the best item for each user and instead just limit the users’ “regret.” Existing work defines regret as the loss in score by limiting consideration to the representative instead of the full dataset, for any chosen ranking function. However, the score is often not a meaningful number, and users may not understand its absolute value. Sometimes small ranges in score can include large fractions of the dataset. In contrast, users do understand the notion of rank ordering. Therefore, we consider items’ positions in the ranked list in defining the regret and propose the rank-regret representative as the minimal subset of the data containing at least one of the top-k of any possible ranking function. This problem is polynomial time solvable in two-dimensional space but is NP-hard on three or more dimensions. We design a suite of algorithms to fulfill different purposes, such as whether relaxation is permitted on k, the result size, or both, whether a distribution is known, whether theoretical guarantees or practical efficiency is important, and so on. Experiments on real datasets demonstrate that we can efficiently find small subsets with small rank-regrets.

1 citation

References

Book Chapter•DOI•

On the Uniform Convergence of Relative Frequencies of Events to Their Probabilities

Vladimir Vapnik, A. Ya. Chervonenkis

01 Jan 1971-Theory of Probability and Its Applications

TL;DR: This chapter reproduces the English translation by B. Seckler of the paper by Vapnik and Chervonenkis in which they gave proofs for the innovative results they had obtained in a draft form in July 1966 and announced in 1968 in their note in Soviet Mathematics Doklady.

Abstract: This chapter reproduces the English translation by B. Seckler of the paper by Vapnik and Chervonenkis in which they gave proofs for the innovative results they had obtained in a draft form in July 1966 and announced in 1968 in their note in Soviet Mathematics Doklady. The paper was first published in Russian as Вапник В. Н. and Червоненкис А. Я. О равномерной сходимости частот появления событий к их вероятностям. Теория вероятностей и ее применения 16(2), 264–279 (1971).

3,939 citations

"ε-nets and simplex range queries" refers background or methods or result in this paper

...The drawback is that the constants, if derived from the results in [17], can be quite large....
[...]
...More generally, we characterize the classes of ranges for which there exists a function f(ε) for ε > 0 such that any finite point set A has an ε-net of size f(ε), independently of the size of A. These are precisely the classes of ranges with finite Vapnik-Chervonenkis dimension, known as Vapnik-Chervonenkis classes [17], [9], [19], [1]....
[...]
...The key concepts and proof techniques of this section are based on the pioneering work of Vapnik and Chervonenkis [17]....
[...]
...Example 5. Let A be a set of n points in E^2. Since the dimension of (E^2, H_2^+) is 2, the results in [17, Theorem 2] show that there exists a 0.01-approximation V of A for positive half-planes (and thus for all half-planes) with |V| = 2,525,039....
[...]
...Using the related notion of an ε-approximation (directly from [17]), we also point out trivial data structures of constant size that give approximate solutions to the counting problem for halfspaces in constant time (compare [13])....
[...]

Book•

Algorithms in Combinatorial Geometry

Herbert Edelsbrunner¹•Institutions (1)

University of Illinois at Urbana–Champaign¹

01 Jan 1987

TL;DR: This book offers a modern approach to computational geometry, an area that studies the computational complexity of geometric problems and in which combinatorial investigations play an important role.

Abstract: This book offers a modern approach to computational geometry, an area that studies the computational complexity of geometric problems. Combinatorial investigations play an important role in this study.

2,284 citations

"ε-nets and simplex range queries" refers background in this paper

...We conclude this section by examining the relationship between the notion of an ε-net and the established notion of a centerpoint [21], [11] in combinatorial geometry....
[...]
..., [11] for a general treatment of arrangements....
[...]

Journal Article•DOI•

On the density of families of sets

Norbert Sauer¹•Institutions (1)

University of Calgary¹

01 Jul 1972-Journal of Combinatorial Theory, Series A

TL;DR: This paper answers the following question in the affirmative by determining the exact upper bound: if T is a family of subsets of some infinite set S, then either there exists for each number n a set A ⊂ S with |A| = n such that |T ∩ A| = 2^n, or there exists some number N such that |T ∩ A| ≤ |A|^c for each A ⊂ S with |A| ≥ N and some constant c.

1,029 citations

"ε-nets and simplex range queries" refers background in this paper

...Now the assertion can be seen as the dual formulation of Carathéodory's theorem (see [15], Theorem 2.3.5), which states that if a point x is in the convex hull of a set A in E^d, then there exists a subset A' of A such that |A'| ≤ d + 1 and x is in the convex hull of A'. □...
[...]

Journal Article•DOI•

Central Limit Theorems for Empirical Measures

Richard M. Dudley

01 Dec 1978-Annals of Probability

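TL;DR: This article studies the convergence in law of the normalized empirical measure, viewed as a stochastic process indexed by a class of sets, to a certain Gaussian process indexed by the same class; a class for which convergence holds with respect to the supremum norm is called a Donsker class.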

Abstract: Let $(X, \mathscr{A}, P)$ be a probability space. Let $X_1, X_2,\cdots,$ be independent $X$-valued random variables with distribution $P$. Let $P_n := n^{-1}(\delta_{X_1} + \cdots + \delta_{X_n})$ be the empirical measure and let $\nu_n := n^{1/2}(P_n - P)$. Given a class $\mathscr{C} \subset \mathscr{A}$, we study the convergence in law of $\nu_n$, as a stochastic process indexed by $\mathscr{C}$, to a certain Gaussian process indexed by $\mathscr{C}$. If convergence holds with respect to the supremum norm $\sup_{C \in \mathscr{C}}|f(C)|$, in a suitable (usually nonseparable) function space, we call $\mathscr{C}$ a Donsker class. For measurability, $X$ may be a complete separable metric space, $\mathscr{A} =$ Borel sets, and $\mathscr{C}$ a suitable collection of closed sets or open sets. Then for the Donsker property it suffices that for some $m$, and every set $F \subset X$ with $m$ elements, $\mathscr{C}$ does not cut all subsets of $F$ (Vapnik-Červonenkis classes). Another sufficient condition is based on metric entropy with inclusion. If $\mathscr{C}$ is a sequence $\{C_m\}$ independent for $P$, then $\mathscr{C}$ is a Donsker class if and only if for some $r$, $\sum_m (P(C_m)(1 - P(C_m)))^r < \infty$.

555 citations

Journal Article•DOI•

The power of geometric duality

Bernard Chazelle¹, Leonidas J. Guibas², Der-Tsai Lee³•Institutions (3)

Brown University¹, PARC², Northwestern University³

01 Jun 1985-BIT Numerical Mathematics

TL;DR: A new formulation of the notion of duality that allows the unified treatment of a number of geometric problems is used to solve two long-standing problems of computational geometry, including obtaining a quadratic algorithm for computing the minimum-area triangle with vertices chosen among n points in the plane.

Abstract: This paper uses a new formulation of the notion of duality that allows the unified treatment of a number of geometric problems. In particular, we are able to apply our approach to solve two long-standing problems of computational geometry: one is to obtain a quadratic algorithm for computing the minimum-area triangle with vertices chosen among n points in the plane; the other is to produce an optimal algorithm for the half-plane range query problem. This problem is to preprocess n points in the plane, so that given a test half-plane, one can efficiently determine all points lying in the half-plane. We describe an optimal O(k + log n) time algorithm for answering such queries, where k is the number of points to be reported. The algorithm requires O(n) space and O(n log n) preprocessing time. Both of these results represent significant improvements over the best methods previously known. In addition, we give a number of new combinatorial results related to the computation of line arrangements.
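
The half-plane range query problem stated in this abstract can be pinned down with a brute-force baseline. The Python sketch below answers a query by scanning all n points, so it takes O(n) time per query with no preprocessing; it is explicitly not the O(k + log n) algorithm of the paper, only a reference against which a faster structure could be tested. The function name and the representation of a half-plane as the set of points with a·x + b·y ≤ c are assumptions made for the example.

```python
def halfplane_report(points, a, b, c):
    """Report all points (x, y) lying in the closed half-plane a*x + b*y <= c.

    Linear scan over the input: O(n) per query, no preprocessing.
    """
    return [(x, y) for (x, y) in points if a * x + b * y <= c]


if __name__ == "__main__":
    pts = [(0, 0), (1, 1), (2, 0), (0, 3)]
    # Query half-plane: x + y <= 2
    print(halfplane_report(pts, 1, 1, 2))
```

Points on the bounding line are included because the half-plane is taken to be closed.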

286 citations

"ε-nets and simplex range queries" refers methods in this paper

...It should be noted that better bounds are possible for reporting in two dimensions (specifically O(log n + t) time, where t is the number of points reported [3]), but these techniques only work for half-planes....
[...]