ε-nets and simplex range queries

doi:10.1007/BF02187876

Journal Article•DOI•

ε-nets and simplex range queries

David Haussler¹, Emo Welzl²•Institutions (2)

University of California, Santa Cruz¹, University of Graz²

01 Dec 1987-Discrete and Computational Geometry (Springer New York)-Vol. 2, Iss: 1, pp 127-151

TL;DR: The concept of an ε-net of a set of points for an abstract set of ranges is introduced and sufficient conditions that a random sample is an ε-net with any desired probability are given.

Abstract: We demonstrate the existence of data structures for half-space and simplex range queries on finite point sets in d-dimensional space, d ≥ 2, with linear storage and O(n^α) query time, $$\alpha = \frac{d(d - 1)}{d(d - 1) + 1} + \gamma \quad \text{for all } \gamma > 0.$$ These bounds are better than those previously published for all d ≥ 2. Based on ideas due to Vapnik and Chervonenkis, we introduce the concept of an ε-net of a set of points for an abstract set of ranges and give sufficient conditions that a random sample is an ε-net with any desired probability. Using these results, we demonstrate how random samples can be used to build a partition-tree structure that achieves the above query time.
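To make the ε-net notion in the abstract concrete, here is a minimal Python sketch that checks the property for the simplest range class, intervals on the real line, rather than the half-spaces and simplices the paper treats. It assumes the usual definition (a subset N of a point set A is an ε-net if every range containing more than ε|A| points of A also contains a point of N) and assumes the points are distinct. The function name `is_eps_net_for_intervals` is hypothetical and does not come from the paper.

```python
import random


def is_eps_net_for_intervals(points, sample, eps):
    """Return True if `sample` is an eps-net of `points` for intervals.

    Assumes distinct points. An interval meets the sorted point set in a
    run of consecutive points, so the sample fails exactly when some run
    of more than eps * len(points) consecutive points avoids the sample.
    """
    chosen = set(sample)
    threshold = eps * len(points)
    run = 0  # length of the current run of consecutive points not in the sample
    for p in sorted(points):
        if p in chosen:
            run = 0
        else:
            run += 1
            if run > threshold:
                return False
    return True


if __name__ == "__main__":
    rng = random.Random(1)
    points = list(range(1000))
    sample = rng.sample(points, 60)  # a random sample, as in the abstract
    print(is_eps_net_for_intervals(points, sample, eps=0.1))
```

Whether a given random sample passes depends on the draw; the paper's contribution is a bound on the sample size that makes success likely, which this sketch does not reproduce.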

Citations

Posted Content•

Random hyperplane search trees in high dimensions

Luc Devroye¹, James King²•Institutions (2)

McGill University¹, University of Oxford²

02 Jun 2011-arXiv: Computational Geometry

TL;DR: For any fixed dimension d, a random hyperplane search tree storing n points is shown to have height at most (1 + O(1/sqrt(d))) log_2 n and average element depth at most (1 + O(1/d)) log_2 n with high probability as n \rightarrow \infty.

Abstract: Given a set S of n \geq d points in general position in R^d, a random hyperplane split is obtained by sampling d points uniformly at random without replacement from S and splitting based on their affine hull. A random hyperplane search tree is a binary space partition tree obtained by recursive application of random hyperplane splits. We investigate the structural distributions of such random trees with a particular focus on the growth with d. A blessing of dimensionality arises: as d increases, random hyperplane splits more closely resemble perfectly balanced splits; in turn, random hyperplane search trees more closely resemble perfectly balanced binary search trees. We prove that, for any fixed dimension d, a random hyperplane search tree storing n points has height at most (1 + O(1/sqrt(d))) log_2 n and average element depth at most (1 + O(1/d)) log_2 n with high probability as n \rightarrow \infty. Further, we show that these bounds are asymptotically optimal with respect to d.
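The recursive construction described in this abstract can be sketched for the planar case d = 2, where the affine hull of the two sampled points is a line and the remaining points are split by the sign of a cross product. This is only an illustrative sketch: storing the sampled points at the node and passing the rest to the children is an assumption made here, and the names `side`, `build_tree`, `height`, and `count_points` are hypothetical.

```python
import random


def side(a, b, p):
    """Cross product sign: positive if p lies to the left of the directed line a -> b."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def build_tree(points, rng):
    """Build a random hyperplane search tree for 2-D points (d = 2).

    Assumption of this sketch: the d sampled points are stored at the node
    and only the remaining points are passed on to the two children.
    """
    if not points:
        return None
    if len(points) < 2:
        return {"pivots": list(points), "left": None, "right": None}
    # Sample d = 2 distinct indices uniformly at random, without replacement.
    i, j = rng.sample(range(len(points)), 2)
    a, b = points[i], points[j]
    rest = [p for k, p in enumerate(points) if k != i and k != j]
    left = [p for p in rest if side(a, b, p) > 0]
    # Points exactly on the line would go right; none exist in general position.
    right = [p for p in rest if side(a, b, p) <= 0]
    return {
        "pivots": [a, b],
        "left": build_tree(left, rng),
        "right": build_tree(right, rng),
    }


def height(node):
    """Number of nodes on the longest root-to-leaf path (0 for an empty tree)."""
    if node is None:
        return 0
    return 1 + max(height(node["left"]), height(node["right"]))


def count_points(node):
    """Total number of points stored in the tree."""
    if node is None:
        return 0
    return len(node["pivots"]) + count_points(node["left"]) + count_points(node["right"])


if __name__ == "__main__":
    rng = random.Random(0)
    pts = [(rng.random(), rng.random()) for _ in range(500)]
    tree = build_tree(pts, rng)
    print("points stored:", count_points(tree), "height:", height(tree))
```

Because each node here stores two points, heights measured this way are not directly comparable to the log_2 n bounds quoted in the abstract.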

Posted Content•

Bounding the trace function of a hypergraph with applications

Farhad Shahrokhi¹•Institutions (1)

University of North Texas¹

25 Jul 2020-arXiv: Combinatorics

TL;DR: An upper bound on the trace function of a hypergraph H is derived and its applications are demonstrated, including a new upper bound for the VC dimension of H that can be used to compute $vc(H)$ in polynomial time provided that $H$ has bounded degeneracy.

Abstract: An upper bound on the trace function of a hypergraph $H$ is derived and its applications are demonstrated. For instance, a new upper bound for the VC dimension of $H$, or $vc(H)$, follows as a consequence and can be used to compute $vc(H)$ in polynomial time provided that $H$ has bounded degeneracy. This was not previously known. Particularly, when $H$ is a hypergraph arising from closed neighborhoods of a graph, this approach asymptotically improves the time complexity of the previous result for computing $vc(H)$. Another consequence is a general lower bound on the {\it distinguishing transversal number } of $H$ that gives rise to applications in domination theory of graphs. To effectively apply the methods developed here, one needs to have good estimations of degeneracy, and its variation or reduced degeneracy which is introduced here.

Posted Content•

Danzer's Problem, Effective Constructions of Dense Forests and Digital Sequences.

Ioannis Tsokanos

04 Nov 2021-arXiv: Number Theory

TL;DR: Constructions of dense and optical forests with the best known visibility and density bounds are given in any dimension $d \ge 2$, achieved by constructing a deterministic digital sequence satisfying strong dispersion properties.

Abstract: A 1965 problem due to Danzer asks whether there exists a set in Euclidean space with finite density intersecting any convex body of volume one. A recent approach to this problem is concerned with the construction of dense forests and is obtained by a suitable weakening of the volume constraint. A dense forest is a discrete point set of finite density getting uniformly close to long enough line segments. The distribution of points in a dense forest is then quantified in terms of a visibility function. Another way to weaken the assumptions in Danzer's problem is by relaxing the density constraint. In this respect, a new concept is introduced in this paper, namely that of an optical forest. An optical forest in $\mathbb{R}^{d}$ is a point set with optimal visibility but not necessarily with finite density. In the literature, the best constructions of Danzer sets and dense forests lack effectivity. The goal of this paper is to provide constructions of dense and optical forests which yield the best known results in any dimension $d \ge 2$ both in terms of visibility and density bounds and effectiveness. Namely, there are three main results in this work: (1) the construction of a dense forest with the best known visibility bound which, furthermore, enjoys the property of being deterministic; (2) the deterministic construction of an optical forest with a density failing to be finite only up to a logarithm and (3) the construction of a planar Peres-type forest (that is, a dense forest obtained from a construction due to Peres) with the best known visibility bound. This is achieved by constructing a deterministic digital sequence satisfying strong dispersion properties.

Posted Content•

Distribution-Sensitive Bounds on Relative Approximations of Geometric Ranges

Yufei Tao¹, Yu Wang¹•Institutions (1)

The Chinese University of Hong Kong¹

15 Mar 2019-arXiv: Computational Geometry

TL;DR: A more general bound sensitive to the content of $X$ is shown, together with the first formal justification of why the term $1/\rho$ is not compulsory for "realistic" inputs when $\mathcal{R}$ is constrained to be the set of halfspaces in $\mathbb{R}^d$ for a constant $d$.

Abstract: A family $\mathcal{R}$ of ranges and a set $X$ of points together define a range space $(X, \mathcal{R}|_X)$, where $\mathcal{R}|_X = \{X \cap h \mid h \in \mathcal{R}\}$. We want to find a structure to estimate the quantity $|X \cap h|/|X|$ for any range $h \in \mathcal{R}$ with the $(\rho, \epsilon)$-guarantee: (i) if $|X \cap h|/|X| > \rho$, the estimate must have a relative error $\epsilon$; (ii) otherwise, the estimate must have an absolute error $\rho \epsilon$. The objective is to minimize the size of the structure. Currently, the dominant solution is to compute a relative $(\rho, \epsilon)$-approximation, which is a subset of $X$ with $\tilde{O}(\lambda/(\rho \epsilon^2))$ points, where $\lambda$ is the VC-dimension of $(X, \mathcal{R}|_X)$, and $\tilde{O}$ hides polylog factors. This paper shows a more general bound sensitive to the content of $X$. We give a structure that stores $O(\log (1/\rho))$ integers plus $\tilde{O}(\theta \cdot (\lambda/\epsilon^2))$ points of $X$, where $\theta$ - called the disagreement coefficient - measures how much the ranges differ from each other in their intersections with $X$. The value of $\theta$ is between 1 and $1/\rho$, such that our space bound is never worse than that of relative $(\rho, \epsilon)$-approximations, but we improve the latter's $1/\rho$ term whenever $\theta = o(\frac{1}{\rho \log (1/\rho)})$. We also prove that, in the worst case, summaries with the $(\rho, 1/2)$-guarantee must consume $\Omega(\theta)$ words even for $d = 2$ and $\lambda \le 3$. We then constrain $\mathcal{R}$ to be the set of halfspaces in $\mathbb{R}^d$ for a constant $d$, and prove the existence of structures with $o(1/(\rho \epsilon^2))$ size offering $(\rho,\epsilon)$-guarantees, when $X$ is generated from various stochastic distributions. This is the first formal justification on why the term $1/\rho$ is not compulsory for "realistic" inputs.
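The $(\rho, \epsilon)$-guarantee stated in this abstract can be written as a small predicate. The sketch below assumes that "relative error $\epsilon$" means the estimate is within $\epsilon$ times the true fraction, which is one common reading and not confirmed by the abstract itself; the function name `satisfies_guarantee` is hypothetical.

```python
def satisfies_guarantee(true_frac, estimate, rho, eps):
    """Check the (rho, eps)-guarantee for a single range.

    true_frac is |X ∩ h| / |X| and estimate is the structure's answer.
    Assumption: "relative error eps" is read as |estimate - true| <= eps * true.
    """
    error = abs(estimate - true_frac)
    if true_frac > rho:
        # Case (i): heavy range, relative error eps is required.
        return error <= eps * true_frac
    # Case (ii): light range, absolute error rho * eps is required.
    return error <= rho * eps


if __name__ == "__main__":
    # A heavy range (fraction 0.5 > rho) estimated as 0.54 with eps = 0.1.
    print(satisfies_guarantee(0.5, 0.54, rho=0.1, eps=0.1))
```

The two branches meet at the threshold: when the true fraction equals $\rho$, the allowed error is $\rho\epsilon$ under either rule.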

Posted Content•

Approximate Maximum Halfspace Discrepancy

Michael Matheny¹, Jeff M. Phillips¹•Institutions (1)

University of Utah¹

25 Jun 2021-arXiv: Computational Geometry

TL;DR: The authors consider the problem of approximately solving the maximum discrepancy problem in the geometric range space defined by $d$-dimensional halfspaces, where the point set is the disjoint union of a red and a blue set.

Abstract: Consider the geometric range space $(X, \mathcal{H}_d)$ where $X \subset \mathbb{R}^d$ and $\mathcal{H}_d$ is the set of ranges defined by $d$-dimensional halfspaces. In this setting we consider that $X$ is the disjoint union of a red and blue set. For each halfspace $h \in \mathcal{H}_d$ define a function $\Phi(h)$ that measures the "difference" between the fraction of red and fraction of blue points which fall in the range $h$. In this context the maximum discrepancy problem is to find the $h^* = \arg \max_{h \in (X, \mathcal{H}_d)} \Phi(h)$. We aim to instead find an $\hat{h}$ such that $\Phi(h^*) - \Phi(\hat{h}) \le \varepsilon$. This is the central problem in linear classification for machine learning, in spatial scan statistics for spatial anomaly detection, and shows up in many other areas. We provide a solution for this problem in $O(|X| + (1/\varepsilon^d) \log^4 (1/\varepsilon))$ time, which improves polynomially over the previous best solutions. For $d=2$ we show that this is nearly tight through conditional lower bounds. For different classes of $\Phi$ we can either provide a $\Omega(|X|^{3/2 - o(1)})$ time lower bound for the exact solution with a reduction to APSP, or an $\Omega(|X| + 1/\varepsilon^{2-o(1)})$ lower bound for the approximate solution with a reduction to 3SUM. A key technical result is a $\varepsilon$-approximate halfspace range counting data structure of size $O(1/\varepsilon^d)$ with $O(\log (1/\varepsilon))$ query time, which we can build in $O(|X| + (1/\varepsilon^d) \log^4 (1/\varepsilon))$ time.

References

Book Chapter•DOI•

On the Uniform Convergence of Relative Frequencies of Events to Their Probabilities

Vladimir Vapnik, A. Ya. Chervonenkis

01 Jan 1971-Theory of Probability and Its Applications

TL;DR: This chapter reproduces the English translation by B. Seckler of the paper by Vapnik and Chervonenkis in which they gave proofs for the innovative results they had obtained in a draft form in July 1966 and announced in 1968 in their note in Soviet Mathematics Doklady.

Abstract: This chapter reproduces the English translation by B. Seckler of the paper by Vapnik and Chervonenkis in which they gave proofs for the innovative results they had obtained in a draft form in July 1966 and announced in 1968 in their note in Soviet Mathematics Doklady. The paper was first published in Russian as Вапник В. Н. and Червоненкис А. Я. О равномерной сходимости частот появления событий к их вероятностям. Теория вероятностей и ее применения 16(2), 264–279 (1971).

3,939 citations

"ε-nets and simplex range queries" refers to background, methods, or results in this paper

...The drawback is that the constants, if derived from the results in [17], can be quite large....
[...]
...More generally, we characterize the classes of ranges for which there exists a function f(ε) for ε > 0 such that any finite point set A has an ε-net of size f(ε), independently of the size of A. These are precisely the classes of ranges with finite Vapnik-Chervonenkis dimension, known as Vapnik-Chervonenkis classes [17], [9], [19], [1]....
[...]
...The key concepts and proof techniques of this section are based on the pioneering work of Vapnik and Chervonenkis [17]....
[...]
...Example 5. Let A be a set of n points in E^2. Since the dimension of (E^2, H_2^+) is 2, the results in [17, Theorem 2] show that there exists a 0.01-approximation V of A for positive half-planes (and thus for all half-planes) with |V| = 2,525,039....
[...]
...Using the related notion of an ε-approximation (directly from [17]), we also point out trivial data structures of constant size that give approximate solutions to the counting problem for halfspaces in constant time (compare [13])....
[...]

Book•

Algorithms in Combinatorial Geometry

Herbert Edelsbrunner¹•Institutions (1)

University of Illinois at Urbana–Champaign¹

01 Jan 1987

TL;DR: This book offers a modern approach to computational geometry, an area that studies the computational complexity of geometric problems; combinatorial investigations play an important role in this study.

Abstract: This book offers a modern approach to computational geometry, an area that studies the computational complexity of geometric problems. Combinatorial investigations play an important role in this study.

2,284 citations

"ε-nets and simplex range queries" refers to background in this paper

...We conclude this section by examining the relationship between the notion of an ε-net and the established notion of a centerpoint [21], [11] in combinatorial geometry....
[...]
..., [11] for a general treatment of arrangements....
[...]

Journal Article•DOI•

On the density of families of sets

Norbert Sauer¹•Institutions (1)

University of Calgary¹

01 Jul 1972-Journal of Combinatorial Theory, Series A

TL;DR: This paper answers the following question in the affirmative by determining the exact upper bound: if T is a family of subsets of some infinite set S, then either there exists for each number n a set A ⊂ S with |A| = n such that |T ∩ A| = 2^n, or there exists some number N such that |T ∩ A| ≤ |A|^c for each A with |A| ≥ N and some constant c.

1,029 citations

"ε-nets and simplex range queries" refers to background in this paper

...Now the assertion can be seen as the dual formulation of Carathéodory's theorem (see [15], Theorem 2.3.5), which states that if a point x is in the convex hull of a set A in E^d, then there exists a subset A' of A such that |A'| ≤ d + 1 and x is in the convex hull of A'. □...
[...]

Journal Article•DOI•

Central Limit Theorems for Empirical Measures

Richard M. Dudley

01 Dec 1978-Annals of Probability

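TL;DR: The convergence in law of the normalized empirical measure $\nu_n = n^{1/2}(P_n - P)$, viewed as a stochastic process indexed by a class $\mathscr{C}$ of sets, to a certain Gaussian process indexed by $\mathscr{C}$ is studied; when convergence holds with respect to the supremum norm, $\mathscr{C}$ is called a Donsker class.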

Abstract: Let $(X, \mathscr{A}, P)$ be a probability space. Let $X_1, X_2,\cdots,$ be independent $X$-valued random variables with distribution $P$. Let $P_n := n^{-1}(\delta_{X_1} + \cdots + \delta_{X_n})$ be the empirical measure and let $\nu_n := n^{1/2}(P_n - P)$. Given a class $\mathscr{C} \subset \mathscr{A}$, we study the convergence in law of $\nu_n$, as a stochastic process indexed by $\mathscr{C}$, to a certain Gaussian process indexed by $\mathscr{C}$. If convergence holds with respect to the supremum norm $\sup_{C \in \mathscr{C}}|f(C)|$, in a suitable (usually nonseparable) function space, we call $\mathscr{C}$ a Donsker class. For measurability, $X$ may be a complete separable metric space, $\mathscr{A} =$ Borel sets, and $\mathscr{C}$ a suitable collection of closed sets or open sets. Then for the Donsker property it suffices that for some $m$, and every set $F \subset X$ with $m$ elements, $\mathscr{C}$ does not cut all subsets of $F$ (Vapnik-Červonenkis classes). Another sufficient condition is based on metric entropy with inclusion. If $\mathscr{C}$ is a sequence $\{C_m\}$ independent for $P$, then $\mathscr{C}$ is a Donsker class if and only if for some $r$, $\sum_m (P(C_m)(1 - P(C_m)))^r < \infty$.

555 citations

Journal Article•DOI•

The power of geometric duality

Bernard Chazelle¹, Leonidas J. Guibas², Der-Tsai Lee³•Institutions (3)

Brown University¹, PARC², Northwestern University³

01 Jun 1985-BIT Numerical Mathematics

TL;DR: A new formulation of the notion of duality that allows the unified treatment of a number of geometric problems is used to solve two long-standing problems of computational geometry, including a quadratic algorithm for computing the minimum-area triangle with vertices chosen among n points in the plane.

Abstract: This paper uses a new formulation of the notion of duality that allows the unified treatment of a number of geometric problems. In particular, we are able to apply our approach to solve two long-standing problems of computational geometry: one is to obtain a quadratic algorithm for computing the minimum-area triangle with vertices chosen among n points in the plane; the other is to produce an optimal algorithm for the half-plane range query problem. This problem is to preprocess n points in the plane, so that given a test half-plane, one can efficiently determine all points lying in the half-plane. We describe an optimal O(k + log n) time algorithm for answering such queries, where k is the number of points to be reported. The algorithm requires O(n) space and O(n log n) preprocessing time. Both of these results represent significant improvements over the best methods previously known. In addition, we give a number of new combinatorial results related to the computation of line arrangements.

286 citations

"ε-nets and simplex range queries" refers to methods in this paper

...It should be noted that better bounds are possible for reporting in two dimensions (specifically O(log n + t) time, where t is the number of points reported [3]), but these techniques only work for half-planes....
[...]