Journal ArticleDOI

ε-nets and simplex range queries

01 Dec 1987-Discrete and Computational Geometry (Springer New York)-Vol. 2, Iss: 1, pp 127-151
TL;DR: The concept of an ε-net of a set of points for an abstract set of ranges is introduced, and sufficient conditions that a random sample is an ε-net with any desired probability are given.
Abstract: We demonstrate the existence of data structures for half-space and simplex range queries on finite point sets in d-dimensional space, d ≥ 2, with linear storage and $O(n^{\alpha})$ query time, $$\alpha = \frac{d(d - 1)}{d(d - 1) + 1} + \gamma \quad \text{for all } \gamma > 0.$$ These bounds are better than those previously published for all d ≥ 2. Based on ideas due to Vapnik and Chervonenkis, we introduce the concept of an ε-net of a set of points for an abstract set of ranges and give sufficient conditions that a random sample is an ε-net with any desired probability. Using these results, we demonstrate how random samples can be used to build a partition-tree structure that achieves the above query time.
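To make the ε-net notion concrete, here is a minimal Python sketch for the simplest range family, closed intervals on the real line (not the half-spaces or simplices treated in the paper). It uses the standard definition: a subset N of a point set A is an ε-net if every range containing more than ε|A| points of A also contains a point of N. The function name `is_eps_net_for_intervals`, the assumption that the points are distinct, and the sample size in the demo are illustrative choices, not taken from the paper.

```python
import random


def is_eps_net_for_intervals(points, net, eps):
    """Check whether `net` is an eps-net of `points` for closed intervals.

    Assumes (for illustration) that `points` are distinct reals and that
    `net` is a subset of `points`. An interval that avoids every net point
    can only capture a run of consecutive non-net points in sorted order,
    so it suffices to check that no such run exceeds eps * len(points).
    """
    net_set = set(net)
    threshold = eps * len(points)
    run = 0  # length of the current run of consecutive non-net points
    for p in sorted(points):
        if p in net_set:
            run = 0
        else:
            run += 1
            if run > threshold:
                return False  # some heavy interval misses the net
    return True


if __name__ == "__main__":
    rng = random.Random(0)
    points = list(range(1000))
    # Hypothetical sample size; the paper gives sufficient sizes in terms
    # of eps, the VC-dimension, and the desired success probability.
    sample = rng.sample(points, 100)
    print(is_eps_net_for_intervals(points, sample, eps=0.1))
```

Taking every tenth point of 0, ..., 99 leaves runs of nine uncovered points, so that subset is a 0.1-net but not a 0.05-net for intervals.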


Citations
01 Jan 1994
TL;DR: Pach's number: [098] Reference DCG-CHAPTER-2008-012 Record created on 2008-11-18, modified on 2017-05-12.
Abstract: Note: Professor Pach's number: [098] Reference DCG-CHAPTER-2008-012 Record created on 2008-11-18, modified on 2017-05-12

44 citations

Journal ArticleDOI
TL;DR: It is proved that any data structure of that form must occupy storage $\Omega(n^{d(1 - \delta) - \varepsilon})$, for any fixed $\varepsilon > 0$, and the lower bound is tight within a factor of $n^{\varepsilon}$.
Abstract: We give a lower bound on the following problem, known as simplex range reporting: Given a collection P of n points in d-space and an arbitrary simplex q, find all the points in P ∩ q. It is understood that P is fixed and can be preprocessed ahead of time, while q is a query that must be answered on-line. We consider data structures for this problem that can be modeled on a pointer machine and whose query time is bounded by $O(n^{\delta} + r)$, where r is the number of points to be reported and δ is an arbitrary fixed real. We prove that any data structure of that form must occupy storage $\Omega(n^{d(1 - \delta) - \varepsilon})$, for any fixed $\varepsilon > 0$. This lower bound is tight within a factor of $n^{\varepsilon}$.
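As a rough illustration of the trade-off in this abstract, the bound can be instantiated for a few sample parameter values, writing $\varepsilon$ for the fixed positive constant in the exponent. The values of $d$ and $\delta$ below are chosen only for the sake of example and are not taken from the paper.

$$
\begin{aligned}
d = 2,\ \delta = \tfrac{1}{2}: &\quad \text{query } O(n^{1/2} + r) \;\Rightarrow\; \text{storage } \Omega\big(n^{2(1 - 1/2) - \varepsilon}\big) = \Omega\big(n^{1 - \varepsilon}\big), \\
d = 3,\ \delta = \tfrac{1}{3}: &\quad \text{query } O(n^{1/3} + r) \;\Rightarrow\; \text{storage } \Omega\big(n^{3(1 - 1/3) - \varepsilon}\big) = \Omega\big(n^{2 - \varepsilon}\big), \\
d \text{ arbitrary},\ \delta = 0: &\quad \text{query } O(1 + r) \;\Rightarrow\; \text{storage } \Omega\big(n^{d - \varepsilon}\big).
\end{aligned}
$$

In each case a smaller query exponent $\delta$ forces a larger storage exponent $d(1 - \delta)$.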

44 citations


Cites background from "ε-nets and simplex range queries"

  • ...This problem, known as simplex range searching, has been extensively studied in recent years [4, 5, 6, 8, 9, 11, 14, 15, 17, 19, 20]....

    [...]

01 Jan 2012
TL;DR: In this article, it was shown that large numbers of such training samples can guarantee that the learned classifier performs just as well as one learned from target generated samples, under some assumptions about the relationship between the training and the target data distributions.
Abstract: The Domain Adaptation problem in machine learning occurs when the distribution generating the test data differs from the one that generates the training data. A common approach to this issue is to train a standard learner for the learning task with the available training sample (generated by a distribution that is different from the test distribution). One can view such learning as learning from a not-perfectly-representative training sample. The question we focus on is under which circumstances large sizes of such training samples can guarantee that the learned classifier performs just as well as one learned from target generated samples. In other words, are there circumstances in which quantity can compensate for quality (of the training data)? We give a positive answer, showing that this is possible when using a Nearest Neighbor algorithm. We show this under some assumptions about the relationship between the training and the target data distributions (the assumptions of covariate shift as well as a bound on the ratio of certain probability weights between the source (training) and target (test) distribution). We further show that in a slightly different learning model, when one imposes restrictions on the nature of the learned classifier, these assumptions are not always sufficient to allow such a replacement of the training sample: For proper learning, where the output classifier has to come from a predefined class, we prove that any learner needs access to data generated from the target distribution.

43 citations

Journal ArticleDOI
TL;DR: The correspondence to arrangements is obtained indirectly via a new characterization of uniform oriented matroids: a range space (X, ℛ) naturally corresponds to a uniform oriented matroid of rank |X| − d if and only if its VC-dimension is d, R ∈ ℛ implies X − R ∈ ℛ, and |ℛ| is maximum under these conditions.
Abstract: An arrangement of oriented pseudohyperplanes in affine d-space defines on its set X of pseudohyperplanes a set system (or range space) (X, ℛ), ℛ ⊆ 2^X, of VC-dimension d in a natural way: to every cell c in the arrangement assign the subset of pseudohyperplanes having c on their positive side, and let ℛ be the collection of all these subsets. We investigate and characterize the range spaces corresponding to simple arrangements of pseudohyperplanes in this way; such range spaces are called pseudogeometric, and they have the property that the cardinality of ℛ is maximum for the given VC-dimension. In general, such range spaces are called maximum, and we show that the number of ranges R ∈ ℛ for which X − R ∈ ℛ also, determines whether a maximum range space is pseudogeometric. Two other characterizations go via a simple duality concept and "small" subspaces. The correspondence to arrangements is obtained indirectly via a new characterization of uniform oriented matroids: a range space (X, ℛ) naturally corresponds to a uniform oriented matroid of rank |X| − d if and only if its VC-dimension is d, R ∈ ℛ implies X − R ∈ ℛ, and |ℛ| is maximum under these conditions.

43 citations

Proceedings ArticleDOI
06 Jun 2005
TL;DR: The study of exact geometric algorithms that require limited storage and make only a small number of passes over the input is initiated.
Abstract: We initiate the study of exact geometric algorithms that require limited storage and make only a small number of passes over the input. Fundamental problems such as low-dimensional linear programming and convex hulls are considered.

43 citations

References
Book ChapterDOI
TL;DR: This chapter reproduces the English translation by B. Seckler of the paper by Vapnik and Chervonenkis in which they gave proofs for the innovative results they had obtained in a draft form in July 1966 and announced in 1968 in their note in Soviet Mathematics Doklady.
Abstract: This chapter reproduces the English translation by B. Seckler of the paper by Vapnik and Chervonenkis in which they gave proofs for the innovative results they had obtained in a draft form in July 1966 and announced in 1968 in their note in Soviet Mathematics Doklady. The paper was first published in Russian as Вапник В. Н. and Червоненкис А. Я. О равномерной сходимости частот появления событий к их вероятностям [On the uniform convergence of relative frequencies of events to their probabilities]. Теория вероятностей и ее применения [Theory of Probability and Its Applications] 16(2), 264–279 (1971).

3,939 citations


"ε-nets and simplex range queries" refers background or methods or result in this paper

  • ...The drawback is that the constants, if derived from the results in [17], can be quite large....

    [...]

  • ...More generally, we characterize the classes of ranges for which there exists a function f(ε) for ε > 0 such that any finite point set A has an ε-net of size f(ε), independently of the size of A. These are precisely the classes of ranges with finite Vapnik-Chervonenkis dimension, known as Vapnik-Chervonenkis classes [17], [9], [19], [1]....

    [...]

  • ...The key concepts and proof techniques of this section are based on the pioneering work of Vapnik and Chervonenkis [17]....

    [...]

  • ...Example 5. Let A be a set of n points in $E^2$. Since the dimension of $(E^2, H^{+})$ is 2, the results in [17, Theorem 2] show that there exists a 0.01-approximation V of A for positive half-planes (and thus for all half-planes) with |V| = 2,525,039....

    [...]

  • ...Using the related notion of an ε-approximation (directly from [17]), we also point out trivial data structures of constant size that give approximate solutions to the counting problem for halfspaces in constant time (compare [13])....

    [...]

Book
01 Jan 1987
TL;DR: This book offers a modern approach to computational geometry, an area that studies the computational complexity of geometric problems, with combinatorial investigations playing an important role in this study.
Abstract: This book offers a modern approach to computational geometry, an area that studies the computational complexity of geometric problems. Combinatorial investigations play an important role in this study.

2,284 citations


"ε-nets and simplex range queries" refers background in this paper

  • ...We conclude this section by examining the relationship between the notion of an ε-net and the established notion of a centerpoint [21], [11] in combinatorial geometry....

    [...]

  • ..., [11] for a general treatment of arrangements....

    [...]

Journal ArticleDOI
TL;DR: This paper answers the following question in the affirmative by determining the exact upper bound: if T is a family of subsets of some infinite set S, then either there exists for each number n a set A ⊂ S with |A| = n such that |T ∩ A| = 2^n, or there exists some number N such that |T ∩ A| ≤ |A|^c for each A with |A| ≥ N and some constant c.

1,029 citations


"ε-nets and simplex range queries" refers background in this paper

  • ...Now the assertion can be seen as the dual formulation of Carathéodory's theorem (see [15], Theorem 2.3.5), which states that if a point x is in the convex hull of a set A in $E^d$, then there exists a subset A' of A such that |A'| ≤ d + 1 and x is in the convex hull of A'. □...

    [...]

Journal ArticleDOI
TL;DR: In this article, the convergence in law of the normalized empirical process, indexed by a class of sets, to a certain Gaussian process indexed by the same class is studied; when convergence holds with respect to the supremum norm, the class is called a Donsker class.
Abstract: Let $(X, \mathscr{A}, P)$ be a probability space. Let $X_1, X_2,\cdots,$ be independent $X$-valued random variables with distribution $P$. Let $P_n := n^{-1}(\delta_{X_1} + \cdots + \delta_{X_n})$ be the empirical measure and let $\nu_n := n^\frac{1}{2}(P_n - P)$. Given a class $\mathscr{C} \subset \mathscr{A}$, we study the convergence in law of $\nu_n$, as a stochastic process indexed by $\mathscr{C}$, to a certain Gaussian process indexed by $\mathscr{C}$. If convergence holds with respect to the supremum norm $\sup_{C \in \mathscr{C}}|f(C)|$, in a suitable (usually nonseparable) function space, we call $\mathscr{C}$ a Donsker class. For measurability, $X$ may be a complete separable metric space, $\mathscr{A} =$ Borel sets, and $\mathscr{C}$ a suitable collection of closed sets or open sets. Then for the Donsker property it suffices that for some $m$, and every set $F \subset X$ with $m$ elements, $\mathscr{C}$ does not cut all subsets of $F$ (Vapnik-Cervonenkis classes). Another sufficient condition is based on metric entropy with inclusion. If $\mathscr{C}$ is a sequence $\{C_m\}$ independent for $P$, then $\mathscr{C}$ is a Donsker class if and only if for some $r$, $\sum_m (P(C_m)(1 - P(C_m)))^r < \infty$.

555 citations

Journal ArticleDOI
TL;DR: A new formulation of the notion of duality that allows the unified treatment of a number of geometric problems is used to solve two long-standing problems of computational geometry, including obtaining a quadratic algorithm for computing the minimum-area triangle with vertices chosen among n points in the plane.
Abstract: This paper uses a new formulation of the notion of duality that allows the unified treatment of a number of geometric problems. In particular, we are able to apply our approach to solve two long-standing problems of computational geometry: one is to obtain a quadratic algorithm for computing the minimum-area triangle with vertices chosen among n points in the plane; the other is to produce an optimal algorithm for the half-plane range query problem. This problem is to preprocess n points in the plane, so that given a test half-plane, one can efficiently determine all points lying in the half-plane. We describe an optimal O(k + log n) time algorithm for answering such queries, where k is the number of points to be reported. The algorithm requires O(n) space and O(n log n) preprocessing time. Both of these results represent significant improvements over the best methods previously known. In addition, we give a number of new combinatorial results related to the computation of line arrangements.

286 citations


"ε-nets and simplex range queries" refers methods in this paper

  • ...It should be noted that better bounds are possible for reporting in two dimensions (specifically O(log n + t) time, where t is the number of points reported [3]), but these techniques only work for half-planes....

    [...]