Journal ArticleDOI

Learnability and the Vapnik-Chervonenkis dimension

TL;DR: This paper shows that the essential condition for distribution-free learnability is finiteness of the Vapnik-Chervonenkis dimension, a simple combinatorial parameter of the class of concepts to be learned.
Abstract: Valiant's learnability model is extended to learning classes of concepts defined by regions in Euclidean space Eⁿ. The methods in this paper lead to a unified treatment of some of Valiant's results, along with previous results on distribution-free convergence of certain pattern recognition algorithms. It is shown that the essential condition for distribution-free learnability is finiteness of the Vapnik-Chervonenkis dimension, a simple combinatorial parameter of the class of concepts to be learned. Using this parameter, the complexity and closure properties of learnable classes are analyzed, and necessary and sufficient conditions are provided for feasible learnability.
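The combinatorial parameter in the abstract can be illustrated concretely: a class shatters a point set if its concepts realise every possible labelling of the points, and the VC dimension is the size of the largest shattered set. A minimal sketch (function names are mine, not the paper's), using threshold concepts on the real line, whose VC dimension is 1:

```python
def is_shattered(points, concepts):
    """True iff `concepts` realises all 2^|points| labellings of `points`,
    i.e. the concept class shatters the point set."""
    realised = {tuple(c(x) for x in points) for c in concepts}
    return len(realised) == 2 ** len(points)

# Threshold concepts on the line: c_t(x) = 1 iff x >= t.
# Their VC dimension is 1: any single point is shattered, but no pair is,
# since labelling the left point 1 and the right point 0 is impossible.
thresholds = [lambda x, t=t: int(x >= t) for t in (-1.0, 0.5, 1.5, 3.0)]

print(is_shattered([1.0], thresholds))        # → True
print(is_shattered([1.0, 2.0], thresholds))   # → False
```

By the paper's main result, finiteness of this parameter is exactly what makes a class distribution-free learnable.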


Citations
Journal ArticleDOI
TL;DR: Using a deterministic analysis in a general metric space setting, this paper provides a technique for constructing a successful prediction algorithm, given a successful estimation algorithm, for the prediction of changing concepts.
Abstract: This paper examines learning problems in which the target function is allowed to change. The learner sees a sequence of random examples, labelled according to a sequence of functions, and must provide an accurate estimate of the target function sequence. We consider a variety of restrictions on how the target function is allowed to change, including infrequent but arbitrary changes, sequences that correspond to slow walks on a graph whose nodes are functions, and changes that are small on average, as measured by the probability of disagreements between consecutive functions. We first study estimation, in which the learner sees a batch of examples and is then required to give an accurate estimate of the function sequence. Our results provide bounds on the sample complexity and allowable drift rate for these problems. We also study prediction, in which the learner must produce online a hypothesis after each labelled example and the average misclassification probability over this hypothesis sequence should be small. Using a deterministic analysis in a general metric space setting, we provide a technique for constructing a successful prediction algorithm, given a successful estimation algorithm. This leads to sample complexity and drift rate bounds for the prediction of changing concepts.
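The estimation-under-drift setting above can be illustrated with a toy sketch (all names, parameters, and the windowed learner are my illustrative choices, not the paper's algorithm): a threshold concept on [0,1] changes once, and the learner fits a threshold consistent with a sliding window of recent examples, keeping the average misclassification rate small.

```python
import random

def window_estimate(history, window):
    """Fit a threshold consistent with the last `window` examples:
    halfway between the largest negative point and the smallest positive."""
    recent = history[-window:]
    lo = max((x for x, y in recent if y == 0), default=0.0)
    hi = min((x for x, y in recent if y == 1), default=1.0)
    return (lo + hi) / 2

random.seed(1)
theta = 0.3                # current target: label(x) = 1 iff x >= theta
history, mistakes = [], 0
for t in range(2000):
    if t == 1000:
        theta = 0.7        # one infrequent, arbitrary change in the target
    x = random.random()
    y = int(x >= theta)
    if history:
        pred = int(x >= window_estimate(history, window=50))
        mistakes += int(pred != y)
    history.append((x, y))

rate = mistakes / 1999
print(rate)  # small: errors come from estimation noise plus a short burst after the change
```

The window length plays the role of the paper's trade-off: a short window recovers quickly from changes but estimates each fixed target less accurately.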

77 citations

Journal ArticleDOI
TL;DR: The problem of learning boolean functions in query and mistake-bound models in the presence of irrelevant attributes is addressed and a large class of functions, including the set of monotone functions, is described, for which learnability does imply attribute-efficient learnability in this model.

77 citations

Journal ArticleDOI
TL;DR: In this article, the authors present a previously unnoticed universal property of stress patterns in the world's languages: they are, for small neighbourhoods, neighbourhood-distinct, a locality condition defined in automata-theoretic terms.
Abstract: This paper presents a previously unnoticed universal property of stress patterns in the world's languages: they are, for small neighbourhoods, neighbourhood-distinct. Neighbourhood-distinctness is a locality condition defined in automata-theoretic terms. This universal is established by examining stress patterns contained in two typological studies. Strikingly, many logically possible – but unattested – patterns do not have this property. Not only does neighbourhood-distinctness unite the attested patterns in a non-trivial way, it also naturally provides an inductive principle allowing learners to generalise from limited data. A learning algorithm is presented which generalises by failing to distinguish same-neighbourhood environments perceived in the learner's linguistic input – hence learning neighbourhood-distinct patterns – as well as almost every stress pattern in the typology. In this way, this work lends support to the idea that properties of the learner can explain certain properties of the attested typology, an idea not straightforwardly available in optimality-theoretic and Principle and Parameter frameworks.

77 citations

Proceedings ArticleDOI
29 May 1995
TL;DR: It is shown that an honest class is exactly polynomial-query learnable if and only if it is learnable using an oracle for Σ₄ᵖ, and a new relationship between query complexity and time complexity in exact learning is shown.
Abstract: We investigate the query complexity of exact learning in the membership and (proper) equivalence query model. We give a complete characterization of concept classes that are learnable with a polynomial number of polynomial-sized queries in this model. We give applications of this characterization, including results on learning a natural subclass of DNF formulas, and on learning with membership queries alone. Query complexity has previously been used to prove lower bounds on the time complexity of exact learning. We show a new relationship between query complexity and time complexity in exact learning: if any "honest" class is exactly and properly learnable with polynomial query complexity, but not learnable in polynomial time, then P ≠ NP. In particular, we show that an honest class is exactly polynomial-query learnable if and only if it is learnable using an oracle for Σ₄ᵖ.
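One regime mentioned above, learning with membership queries alone, can be sketched for a concrete class. The class (monotone conjunctions) and all names below are my illustrative choices, not the paper's characterization; the learner identifies the relevant variables with exactly n membership queries.

```python
def learn_monotone_conjunction(n, membership):
    """Exactly learn a monotone conjunction over n boolean variables using
    membership queries alone: variable i belongs to the conjunction iff
    setting it to 0 in the all-ones assignment flips the target to 0."""
    assert membership([1] * n) == 1, "a monotone conjunction accepts all-ones"
    relevant = []
    for i in range(n):
        x = [1] * n
        x[i] = 0               # one membership query per variable
        if membership(x) == 0:
            relevant.append(i)
    return relevant

# Hypothetical target: x0 AND x3 over 5 variables.
target = lambda x: int(x[0] == 1 and x[3] == 1)
print(learn_monotone_conjunction(5, target))  # → [0, 3]
```

This is the simplest case of the query-counting style of argument the abstract refers to: the query complexity here is linear in n, independent of computation time.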

75 citations

Journal ArticleDOI
01 Jan 1989
TL;DR: A modification of Valiant's distribution-independent protocol for learning is proposed in which the distribution and the function to be learned may be chosen by adversaries; however, these adversaries may not communicate.
Abstract: Within the context of Valiant's protocol for learning, the Perceptron algorithm is shown to learn an arbitrary half-space in time O(n²/ε³) if D, the probability distribution of examples, is taken uniform over the unit sphere Sⁿ. Here ε is the accuracy parameter. This is surprisingly fast, as "standard" approaches involve solving a linear programming problem with Ω(n/ε) constraints in n dimensions. A modification of Valiant's distribution-independent protocol for learning is proposed in which the distribution and the function to be learned may be chosen by adversaries; however, these adversaries may not communicate. It is argued that this definition is more reasonable and more applicable to real-world learning than Valiant's. Under this definition, the Perceptron algorithm is shown to be a distribution-independent learning algorithm. In an appendix we show that, for uniform distributions, some classes of infinite V-C dimension, including convex sets and a class of nested differences of convex sets, are learnable.
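The half-space setting above uses the classic Perceptron update rule. A minimal sketch (the target vector and sampling scheme are illustrative assumptions, with points near the decision boundary discarded so the standard margin-based convergence bound applies):

```python
import random

def perceptron_learn(examples, n, max_passes=1000):
    """Classic Perceptron: cycle through labelled examples (x, y) with
    y in {-1, +1}, updating w whenever sign(w.x) disagrees with y."""
    w = [0.0] * n
    for _ in range(max_passes):
        mistakes = 0
        for x, y in examples:
            if y * sum(wi * xi for wi, xi in zip(w, x)) <= 0:
                w = [wi + y * xi for wi, xi in zip(w, x)]
                mistakes += 1
        if mistakes == 0:       # a full clean pass: all examples classified
            break
    return w

def unit(v):
    s = sum(c * c for c in v) ** 0.5
    return [c / s for c in v]

# Hypothetical target half-space; examples drawn uniformly from the unit
# sphere S^2, skipping points within margin 0.1 of the boundary.
random.seed(0)
target = [1.0, -2.0, 0.5]
examples = []
while len(examples) < 100:
    x = unit([random.gauss(0, 1) for _ in range(3)])
    m = sum(t * c for t, c in zip(target, x))
    if abs(m) < 0.1:
        continue
    examples.append((x, 1 if m > 0 else -1))

w = perceptron_learn(examples, n=3)
errors = sum(1 for x, y in examples
             if y * sum(wi * xi for wi, xi in zip(w, x)) <= 0)
print(errors)  # → 0
```

With margin γ and examples of norm at most 1, the Perceptron makes at most (1/γ)² updates, which is what drives the polynomial running time claimed in the abstract.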

75 citations


Cites background or result from "Learnability and the Vapnik-Chervon..."

  • ...Third, the results of [Blumer et al., 1987] imply that we can only expect to learn a class of functions F if F has finite V-C dimension....


  • ...This would suffice to assure that the hypothesis half-space so generated would (with confidence 1 − δ) have error less than ε, as is seen from [Blumer et al., 1987, Theorem A3]....


  • ...In particular, if F has Vapnik-Chervonenkis (V-C) dimension d, then it has been proved [Blumer et al., 1987] that all A needs to do to be a valid learning algorithm is to call m₀(ε, δ, d) = max((4/ε) log₂(2/δ), (8d/ε) log₂(13/ε)) examples and to find in polynomial time a function g ∈ F which correctly classifies…...


  • ...Thus, for example, it is simple to show that the class H of half-spaces is Valiant learnable [Blumer et al., 1987]....


  • ...First, although the results of [Blumer et al., 1987] tell us we can gather enough information for learning in polynomial time, they say nothing about when we can actually find an algorithm A which learns in polynomial time....

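The sample-size bound quoted in the excerpts above can be evaluated directly. A minimal sketch, assuming the Blumer et al. form m₀(ε, δ, d) = max((4/ε) log₂(2/δ), (8d/ε) log₂(13/ε)) with base-2 logarithms; the function name is mine:

```python
from math import ceil, log2

def behw_sample_bound(eps, delta, d):
    """Sufficient sample size from the Blumer et al. bound: this many
    examples guarantee that any hypothesis from a class of VC dimension d
    that is consistent with the sample has error at most eps, with
    probability at least 1 - delta, under any distribution."""
    return ceil(max((4 / eps) * log2(2 / delta),
                    (8 * d / eps) * log2(13 / eps)))

# Half-spaces in R^2 have VC dimension 3:
print(behw_sample_bound(eps=0.1, delta=0.05, d=3))  # → 1686
```

Note the bound is distribution-free: it depends only on ε, δ, and the VC dimension d, never on the example distribution, which is exactly the point the surrounding excerpts make.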
