Journal ArticleDOI

Learnability and the Vapnik-Chervonenkis dimension

TL;DR: This paper shows that the essential condition for distribution-free learnability is finiteness of the Vapnik-Chervonenkis dimension, a simple combinatorial parameter of the class of concepts to be learned.
Abstract: Valiant's learnability model is extended to learning classes of concepts defined by regions in Euclidean space E^n. The methods in this paper lead to a unified treatment of some of Valiant's results, along with previous results on distribution-free convergence of certain pattern recognition algorithms. It is shown that the essential condition for distribution-free learnability is finiteness of the Vapnik-Chervonenkis dimension, a simple combinatorial parameter of the class of concepts to be learned. Using this parameter, the complexity and closure properties of learnable classes are analyzed, and necessary and sufficient conditions are provided for feasible learnability.
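To make the combinatorial parameter concrete, here is a minimal Python sketch (not from the paper) that brute-forces whether a concept class shatters a point set and takes the largest shattered size as the VC dimension; the names `shatters` and `vc_dimension` and the interval example are ad hoc choices for this illustration.

```python
from itertools import combinations

def shatters(concepts, points):
    """True if every subset of `points` is cut out by some concept.
    `concepts` is an iterable of predicates (functions point -> bool)."""
    points = list(points)
    realized = {frozenset(p for p in points if c(p)) for c in concepts}
    return len(realized) == 2 ** len(points)

def vc_dimension(concepts, domain):
    """Largest d such that some d-element subset of `domain` is shattered (brute force)."""
    d = 0
    for size in range(1, len(domain) + 1):
        if any(shatters(concepts, subset) for subset in combinations(domain, size)):
            d = size
    return d

# Closed intervals [a, b] on a small grid: their VC dimension is 2, since any two
# points can be labeled arbitrarily, but "outer points yes, middle point no" is impossible.
grid = range(10)
intervals = [lambda x, a=a, b=b: a <= x <= b for a in grid for b in grid if a <= b]
print(vc_dimension(intervals, list(range(5))))  # -> 2
```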


Citations
Journal ArticleDOI
01 Nov 2006
TL;DR: A way of measuring the similarity of a Boolean vector to a given set of Boolean vectors is proposed, motivated in part by certain data mining or machine learning problems.
Abstract: We propose a way of measuring the similarity of a Boolean vector to a given set of Boolean vectors, motivated in part by certain data mining or machine learning problems. We relate the similarity measure to one based on Hamming distance and we develop from this some ways of quantifying the 'quality' of a dataset.
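As a small sketch of the Hamming-distance baseline the abstract alludes to (the authors' own similarity measure and notion of dataset 'quality' are not reproduced here), one natural distance-based variant in Python:

```python
def hamming(x, y):
    """Number of coordinates where equal-length Boolean vectors x and y differ."""
    return sum(a != b for a, b in zip(x, y))

def similarity_to_set(x, vectors):
    """Similarity of x to a non-empty set of equal-length Boolean vectors:
    1 when x belongs to the set, decreasing with the distance to its nearest member."""
    return 1 - min(hamming(x, v) for v in vectors) / len(x)

data = [(1, 0, 1, 1), (0, 0, 1, 0), (1, 1, 1, 0)]
print(similarity_to_set((1, 0, 1, 0), data))  # 0.75: one flip away from the nearest vector
```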

19 citations



Journal ArticleDOI
01 Apr 2000
TL;DR: A new algorithm for learning one-variable pattern languages from positive data is proposed and analyzed with respect to its average-case behavior; it converges within an expected constant number of rounds and has a total learning time that is linear in the pattern length.
Abstract: A new algorithm for learning one-variable pattern languages from positive data is proposed and analyzed with respect to its average-case behavior. We consider the total learning time, which takes into account all operations up to convergence to a correct hypothesis. For almost all meaningful distributions defining how the pattern variable is replaced by a string to generate random examples of the target pattern language, it is shown that this algorithm converges within an expected constant number of rounds and a total learning time that is linear in the pattern length. Thus, our solution is average-case optimal in a strong sense. Though one-variable pattern languages can neither be finitely inferred from positive data nor PAC-learned, our approach can be extended to a probabilistic finite learner that exactly infers all one-variable pattern languages from positive data with high confidence. It is a long-standing open problem whether pattern languages can be learned when empty substitutions for the pattern variables are also allowed. Our learning strategy can be generalized to this situation as well. Finally, we show some experimental results for the behavior of this new learning algorithm in practice.
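The learning algorithm itself is not reproduced in this abstract; as a hedged sketch of the learning target only, the Python membership test below checks whether a string belongs to the language of a one-variable pattern (constant letters plus a variable 'x' that must be replaced everywhere by the same non-empty string), which is the hypothesis space the learner searches.

```python
def matches(pattern, w):
    """Does w belong to the one-variable pattern language of `pattern`?
    `pattern` is a string over constant letters and the variable 'x';
    every 'x' must be replaced by the same non-empty string."""
    k = pattern.count('x')
    if k == 0:
        return pattern == w
    c = len(pattern) - k              # total length of the constant part
    if len(w) <= c or (len(w) - c) % k:
        return False
    m = (len(w) - c) // k             # forced length of the substituted string
    i = pattern.index('x')            # read the candidate substitution off the first 'x'
    s = w[i:i + m]
    return pattern.replace('x', s) == w

print(matches("axbxa", "a01b01a"))    # True: x -> "01"
print(matches("axbxa", "a01b10a"))    # False: the two occurrences of x differ
```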

19 citations

Proceedings ArticleDOI
15 Aug 1991
TL;DR: This work formally defines a distribution-dependent notion of algorithmic capacity (which is related to the distribution-free notion of the VC dimension) and provides estimates of the capacity of the proposed algorithms.
Abstract: We investigate algorithms for learning binary weights from examples of majority functions of a set of literals. In particular, given a set of (randomly drawn) input-output pairs, with inputs being binary ±1 vectors and outputs likewise being ±1 classifications, we seek a vector of binary (±1) weights for a linear threshold element (or formal neuron) that provides a linearly separable hypothesis consistent with the set of examples. We present three algorithms, Directed Drift, Harmonic Update, and Majority Rule, for learning binary weights in this context, and examine their characteristics. In particular, we formally define a distribution-dependent notion of algorithmic capacity (which is related to the distribution-free notion of the VC dimension) and provide estimates of the capacity of the proposed algorithms.
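The paper's exact update rules are not given in this abstract; under that caveat, the following Python sketch shows one plausible Directed-Drift-style learner: on a misclassified ±1 example, flip one randomly chosen weight whose sign disagrees with the label-weighted input. All names here are ad hoc.

```python
import random

def directed_drift(examples, n, max_steps=100_000, seed=0):
    """Search for binary (+/-1) weights w with sign(w . x) == y on every (x, y) example."""
    rng = random.Random(seed)
    w = [rng.choice((-1, 1)) for _ in range(n)]
    for _ in range(max_steps):
        wrong = [(x, y) for x, y in examples
                 if (1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1) != y]
        if not wrong:
            return w                              # consistent hypothesis found
        x, y = rng.choice(wrong)
        # Flipping a coordinate with w[i] != y * x[i] moves w . x towards the label y;
        # on a misclassified example at least one such coordinate must exist.
        i = rng.choice([i for i in range(n) if w[i] != y * x[i]])
        w[i] = -w[i]
    return None                                   # gave up within the step budget
```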

19 citations

Journal Article
TL;DR: A combinatorial characterization is given of the sample size sufficient and necessary to learn a class of concepts under pure differential privacy; a similar characterization holds for the database size needed to compute a large class of optimization problems under pure differential privacy, and also for the well-studied problem of private data release.
Abstract: Kasiviswanathan et al. (FOCS 2008) defined private learning as a combination of PAC learning and differential privacy. Informally, a private learner is applied to a collection of labeled individual information and outputs a hypothesis while preserving the privacy of each individual. Kasiviswanathan et al. left open the question of characterizing the sample complexity of private learners. We give a combinatorial characterization of the sample size sufficient and necessary to learn a class of concepts under pure differential privacy. This characterization is analogous to the well known characterization of the sample complexity of non-private learning in terms of the VC dimension of the concept class. We introduce the notion of probabilistic representation of a concept class, and our new complexity measure RepDim corresponds to the size of the smallest probabilistic representation of the concept class. We show that any private learning algorithm for a concept class C with sample complexity m implies RepDim(C) = O(m), and that there exists a private learning algorithm with sample complexity m = O(RepDim(C)). We further demonstrate that a similar characterization holds for the database size needed for computing a large class of optimization problems under pure differential privacy, and also for the well studied problem of private data release.
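As context for the generic upper bound in this line of work (and not the RepDim construction itself), a standard pure-DP building block is the exponential mechanism over a finite hypothesis class, scored by empirical errors; a minimal Python sketch, with all names chosen ad hoc:

```python
import math
import random

def private_learner(sample, hypotheses, eps, seed=0):
    """Pick a hypothesis with probability proportional to exp(-eps * errors(h) / 2).
    Changing one labeled example in `sample` changes each error count by at most 1,
    so this selection is eps-differentially private (exponential mechanism)."""
    rng = random.Random(seed)
    errors = [sum(h(x) != y for x, y in sample) for h in hypotheses]
    weights = [math.exp(-eps * e / 2.0) for e in errors]
    r = rng.random() * sum(weights)
    acc = 0.0
    for h, wgt in zip(hypotheses, weights):
        acc += wgt
        if r <= acc:
            return h
    return hypotheses[-1]
```

The sample complexity of this generic learner grows with the logarithm of the number of hypotheses, which is why measuring the smallest (probabilistic) representation of the class, rather than the class itself, gives a tighter characterization.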

19 citations


Cites background from "Learnability and the Vapnik-Chervonenkis dimension"

  • ...Without privacy, it is well-known that the sample complexity of PAC learning is proportional to the Vapnik–Chervonenkis (VC) dimension of the class C (Vapnik and Chervonenkis, 1971; Blumer et al., 1989; Ehrenfeucht et al., 1989)....


Journal ArticleDOI
TL;DR: It is proved that the following three consistency problems, for concept classes of patterns, graphs and generalized Boolean formulas whose membership problems are known to be NP-complete, are Σ₂ᵖ-complete.
Abstract: The consistency problem associated with a concept class C is to determine, given two sets A and B of examples, whether there exists a concept c in C such that each x in A is a positive example of c and each y in B is a negative example of c. We explore in this paper the following intuition: for a concept class C, if the membership problem of determining whether a given example is positive for a concept is NP-complete, then the corresponding consistency problem is likely to be Σ₂ᵖ-complete. To support this intuition, we prove that the following three consistency problems for concept classes of patterns, graphs and generalized Boolean formulas, whose membership problems are known to be NP-complete, are Σ₂ᵖ-complete: (a) given two sets A and B of strings, determine whether there exists a pattern p such that every string in A is in the language L(p) and every string in B is not in the language L(p); (b) given two sets A and B of graphs, determine whether there exists a graph G such that every graph in A is isomorphic to a subgraph of G and every graph in B is not isomorphic to any subgraph of G; and (c) given two sets A and B of Boolean formulas, determine whether there exists a 3-CNF Boolean formula θ such that for every φ ∈ A, θ ∧ φ is satisfiable and for every ψ ∈ B, θ ∧ ψ is not satisfiable. These results suggest that consistency problems in machine learning are natural candidates for Σ₂ᵖ-complete problems if the corresponding membership problems are known to be NP-complete.
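To make the problem statement concrete, here is a brute-force Python illustration of the consistency question for an explicitly enumerated toy class; the paper's point is that for patterns, subgraph isomorphism, and 3-CNF formulas no such cheap search is expected, since those problems are Σ₂ᵖ-complete. The threshold class below is an ad hoc example.

```python
def consistent_concept(concepts, positives, negatives):
    """Return a concept labeling every example in `positives` positive and every
    example in `negatives` negative, or None if the class contains no such concept."""
    for c in concepts:
        if all(c(a) for a in positives) and not any(c(b) for b in negatives):
            return c
    return None

# Toy concept class: threshold concepts c_t(x) = (x >= t) on the integers 0..9.
thresholds = [lambda x, t=t: x >= t for t in range(10)]
print(consistent_concept(thresholds, positives=[7, 9], negatives=[2, 4]) is not None)  # True
```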

19 citations

References
Book
01 Jan 1979
TL;DR: This is the second edition of a quarterly column providing a continuing update to the list of problems (NP-complete and harder) presented by M. R. Garey and D. S. Johnson in their book "Computers and Intractability: A Guide to the Theory of NP-Completeness," W. H. Freeman & Co., San Francisco, 1979.
Abstract: This is the second edition of a quarterly column the purpose of which is to provide a continuing update to the list of problems (NP-complete and harder) presented by M. R. Garey and myself in our book ‘‘Computers and Intractability: A Guide to the Theory of NP-Completeness,’’ W. H. Freeman & Co., San Francisco, 1979 (hereinafter referred to as ‘‘[G&J]’’; previous columns will be referred to by their dates). A background equivalent to that provided by [G&J] is assumed. Readers having results they would like mentioned (NP-hardness, PSPACE-hardness, polynomial-time-solvability, etc.), or open problems they would like publicized, should send them to David S. Johnson, Room 2C355, Bell Laboratories, Murray Hill, NJ 07974, including details, or at least sketches, of any new proofs (full papers are preferred). In the case of unpublished results, please state explicitly that you would like the results mentioned in the column. Comments and corrections are also welcome. For more details on the nature of the column and the form of desired submissions, see the December 1981 issue of this journal.

40,020 citations

Book
01 Jan 1968

17,939 citations

Book
01 Jan 1973
TL;DR: In this article, a unified, comprehensive and up-to-date treatment of both statistical and descriptive methods for pattern recognition is provided, including Bayesian decision theory, supervised and unsupervised learning, nonparametric techniques, discriminant analysis, clustering, preprocessing of pictorial data, spatial filtering, shape description techniques, perspective transformations, projective invariants, linguistic procedures, and artificial intelligence techniques for scene analysis.
Abstract: Provides a unified, comprehensive and up-to-date treatment of both statistical and descriptive methods for pattern recognition. The topics treated include Bayesian decision theory, supervised and unsupervised learning, nonparametric techniques, discriminant analysis, clustering, preprocessing of pictorial data, spatial filtering, shape description techniques, perspective transformations, projective invariants, linguistic procedures, and artificial intelligence techniques for scene analysis.

13,647 citations