Journal ArticleDOI

Learnability and the Vapnik-Chervonenkis dimension

TL;DR: This paper shows that the essential condition for distribution-free learnability is finiteness of the Vapnik-Chervonenkis dimension, a simple combinatorial parameter of the class of concepts to be learned.
Abstract: Valiant's learnability model is extended to learning classes of concepts defined by regions in Euclidean space Eⁿ. The methods in this paper lead to a unified treatment of some of Valiant's results, along with previous results on distribution-free convergence of certain pattern recognition algorithms. It is shown that the essential condition for distribution-free learnability is finiteness of the Vapnik-Chervonenkis dimension, a simple combinatorial parameter of the class of concepts to be learned. Using this parameter, the complexity and closure properties of learnable classes are analyzed, and necessary and sufficient conditions for feasible learnability are provided.
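To make the combinatorial parameter concrete, the following is a minimal illustrative sketch (not from the paper): it brute-force checks whether a finite point set is shattered by closed intervals on the real line. Intervals have VC dimension 2, so every two-point set is shattered while no three-point set is.

```python
def interval_labels(points, a, b):
    """Label each point 1 if it lies in the closed interval [a, b], else 0."""
    return tuple(1 if a <= x <= b else 0 for x in points)

def is_shattered_by_intervals(points):
    """Check whether every dichotomy of `points` is realized by some interval.

    Brute force: it suffices to try intervals whose endpoints are sample
    points (plus one degenerate empty interval), since only the induced
    labelling of the sample matters.
    """
    points = sorted(points)
    candidates = [(a, b) for a in points for b in points if a <= b]
    candidates.append((1, 0))  # empty interval gives the all-zero labelling
    realized = {interval_labels(points, a, b) for a, b in candidates}
    return len(realized) == 2 ** len(points)

# Intervals on the line have VC dimension 2: pairs are shattered, triples are not.
print(is_shattered_by_intervals([0.0, 1.0]))       # True
print(is_shattered_by_intervals([0.0, 1.0, 2.0]))  # False: labelling (1, 0, 1) is unrealizable
```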


Citations
Proceedings ArticleDOI
01 Jul 1997
TL;DR: This work introduces a simple model of a spiking neuron that, in addition to the weights that model the plasticity of synaptic strength, also has variable transmission delays between neurons as programmable parameters and demonstrates that temporal coding has a surprisingly large impact on the complexity of learning for single neurons.
Abstract: Spiking neurons are models for the computational units in biological neural systems where information is considered to be encoded mainly in the temporal patterns of their activity. They provide a way of analyzing neural computation that is not captured by the traditional neuron models such as sigmoidal and threshold gates (or "Perceptrons"). We introduce a simple model of a spiking neuron that, in addition to the weights that model the plasticity of synaptic strength, also has variable transmission delays between neurons as programmable parameters. For coding of input and output values two modes are taken into account: binary coding for the Boolean and analog coding for the real-valued domain. We investigate the complexity of learning for a single spiking neuron within the framework of PAC-learnability. With regard to sample complexity, we prove that the VC-dimension is O(n log n) and, hence, strictly larger than that of a threshold gate. In particular, the lower bound holds for binary coding and even if all weights are kept fixed. The upper bound is valid for the case of analog coding if weights and delays are programmable. With regard to computational complexity, we show that there is no polynomial-time PAC-learning algorithm, unless RP = NP, for a quite restricted spiking neuron that is only slightly more powerful than a Boolean threshold gate: the consistency problem for a spiking neuron using binary coding and programmable delays from {0, 1} is NP-complete. This holds even if all weights are kept fixed. The results demonstrate that temporal coding has a surprisingly large impact on the complexity of learning for single neurons.
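As a rough illustration of how delays can carry information (a hedged toy with rectangular response pulses, not the authors' exact spiking neuron model), the neuron below fires only if enough delayed pulses overlap in time, so fixed weights can yield different outputs depending on the programmable delays:

```python
def spiking_neuron_output(x, weights, delays, theta, pulse_len=1.0):
    """Toy spiking neuron with programmable weights and delays (binary coding).

    Input bit x[i] = 1 means input i spikes at time 0; the spike arrives at
    the soma after delays[i] and contributes a rectangular pulse of height
    weights[i] lasting pulse_len. Output is 1 iff the summed potential
    reaches the threshold theta at some point in time.
    """
    # With nonnegative weights the potential is piecewise constant and can
    # only increase at pulse onsets, so checking each onset time suffices.
    onsets = [delays[i] for i in range(len(x)) if x[i]]
    for t in sorted(onsets):
        potential = sum(
            w for xi, w, d in zip(x, weights, delays)
            if xi and d <= t < d + pulse_len
        )
        if potential >= theta:
            return 1
    return 0

# Equal weights, same inputs: only the delays decide whether the pulses
# overlap enough to cross the threshold.
print(spiking_neuron_output([1, 1], weights=[1.0, 1.0], delays=[0.0, 0.5], theta=1.5))  # 1
print(spiking_neuron_output([1, 1], weights=[1.0, 1.0], delays=[0.0, 2.0], theta=1.5))  # 0
```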

18 citations

Journal ArticleDOI
TL;DR: An efficient algorithm for PAC-learning a very general class of geometric concepts over ℛ^d for fixed d, and a statistical query version of the algorithm that can tolerate random classification noise, are presented.
Abstract: We present an efficient algorithm for PAC-learning a very general class of geometric concepts over ℛ^d for fixed d. More specifically, let 𝒯 be any set of s halfspaces. Let x = (x_1, …, x_d) be an arbitrary point in ℛ^d. With each t ∈ 𝒯 we associate a Boolean indicator function I_t(x) which is 1 if and only if x is in the halfspace t. The concept class, 𝒞^d_s, that we study consists of all concepts formed by any Boolean function over I_{t_1}, …, I_{t_s} for t_i ∈ 𝒯. This class is much more general than any geometric concept class known to be PAC-learnable. Our results can be extended easily to learn efficiently any Boolean combination of a polynomial number of concepts selected from any concept class 𝒞 over ℛ^d, given that the VC-dimension of 𝒞 depends only on d and there is a polynomial-time algorithm to determine whether there is a concept from 𝒞 consistent with a given set of labeled examples. We also present a statistical query version of our algorithm that can tolerate random classification noise. Finally, we present a generalization of the standard ε-net result of Haussler and Welzl [1987] and apply it to give an alternative noise-tolerant algorithm for d = 2 based on geometric subdivisions.
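For intuition about the concept class 𝒞^d_s (an illustrative sketch with assumed names, not the paper's learning algorithm), a concept is an arbitrary Boolean function applied to the indicator bits of s fixed halfspaces:

```python
def halfspace_indicator(x, w, b):
    """I_t(x) = 1 iff x lies in the halfspace {y : w . y >= b}."""
    return int(sum(wi * xi for wi, xi in zip(w, x)) >= b)

def concept(x, halfspaces, boolean_fn):
    """Evaluate a concept: a Boolean function of the s halfspace indicator bits."""
    bits = tuple(halfspace_indicator(x, w, b) for w, b in halfspaces)
    return boolean_fn(bits)

# Example with d = 2, s = 2: the symmetric difference (XOR) of two halfspaces,
# a concept in the class that is not itself a halfspace.
halfspaces = [((1.0, 0.0), 0.0), ((0.0, 1.0), 0.0)]   # x >= 0 and y >= 0
xor_of_halfspaces = lambda bits: bits[0] ^ bits[1]

print(concept((1.0, -1.0), halfspaces, xor_of_halfspaces))  # 1: inside exactly one halfspace
print(concept((1.0, 1.0), halfspaces, xor_of_halfspaces))   # 0: inside both
```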

18 citations


Cites background or methods from "Learnability and the Vapnik-Chervonenkis dimension"

  • ...…the key technique used in most of this work, consider the problem of learning unions of s halfspaces in d-dimensional space for any constant d [Blumer et al. 1989; Baum 1990a; Brönnimann and Goodrich 1994]. The standard Occam algorithm draws a sufficiently large sample S of m points…...


  • ...Thus, by Theorem 2 [Blumer et al. 1989], we get that the stated sample size suffices....


  • ...Thus instead of using an Occam algorithm as above, we can apply Theorem 1 [Blumer et al. 1989].)...


01 Jun 1997
TL;DR: In this article, the authors introduce a new method of approximating volumes and integrals for a vast class of geometric and number-theoretic problems, motivated by recent developments in the VC-dimension of neural networks and corresponding semi-Pfaffian sets.
Abstract: Motivated by recent developments in the VC-dimension of neural networks and corresponding semi-Pfaffian sets, we introduce a new method of approximating volumes and integrals for a vast class of geometric and number-theoretic problems.

18 citations

01 Dec 1990
TL;DR: A rigorous performance criterion for training algorithms for probabilistic automata (PAs) and hidden Markov models (HMMs), used extensively for speech recognition, is introduced and the complexity of the training problem as a computational problem is analyzed.
Abstract: We introduce a rigorous performance criterion for training algorithms for probabilistic automata (PAs) and hidden Markov models (HMMs), used extensively for speech recognition, and analyze the complexity of the training problem as a computational problem. The PA training problem is the problem of approximating an arbitrary, unknown source distribution by distributions generated by a PA. We investigate the following question about this important, well-studied problem: Does there exist an *efficient* training algorithm such that the trained PAs *provably converge* to a model close to an optimum one with high confidence, after only a feasibly small set of training data? We model this problem in the framework of computational learning theory and analyze the sample as well as computational complexity. We show that the number of examples required for training PAs is moderate -- essentially linear in the number of transition probabilities to be trained and a low-degree polynomial in the example length and parameters quantifying the accuracy and confidence. Computationally, however, training PAs is quite demanding: Fixed state size PAs are trainable in time polynomial in the accuracy and confidence parameters and example length, but *not* in the alphabet size, unless RP = NP. The latter result is shown via a strong non-approximability result for the single string maximum likelihood model problem for 2-state PAs, which is of independent interest.
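To clarify the object being trained (a toy sketch with a hypothetical parameterization, not the training procedure analyzed in the paper), a probabilistic automaton defines a distribution over strings by stochastically emitting symbols and moving between states; training means fitting these probabilities so that the induced distribution approximates an unknown source distribution:

```python
import random

def sample_string(start_probs, transitions, stop_probs, max_len=20):
    """Sample one string from a toy probabilistic automaton.

    start_probs[q]  : probability of starting in state q
    transitions[q]  : list of (probability, symbol, next_state) triples
    stop_probs[q]   : probability of halting when in state q
    """
    states = list(start_probs)
    state = random.choices(states, weights=[start_probs[q] for q in states])[0]
    out = []
    for _ in range(max_len):
        if random.random() < stop_probs[state]:
            break
        probs, symbols, nexts = zip(*transitions[state])
        i = random.choices(range(len(probs)), weights=probs)[0]
        out.append(symbols[i])
        state = nexts[i]
    return "".join(out)

# A 2-state PA over the alphabet {a, b}; its transition and stopping
# probabilities are exactly the parameters a training algorithm would fit.
start = {0: 1.0, 1: 0.0}
trans = {0: [(0.7, "a", 0), (0.3, "b", 1)],
         1: [(0.5, "a", 0), (0.5, "b", 1)]}
stop = {0: 0.2, 1: 0.1}
print([sample_string(start, trans, stop) for _ in range(3)])
```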

18 citations

Journal ArticleDOI
TL;DR: The VC dimension of a leaky integrate-and-fire neuron model is computed, providing a framework for analyzing the computational capabilities of the dynamic systems defined by networks of spiking neurons; the approach extends to arbitrary passive dendritic trees.
Abstract: We compute the VC dimension of a leaky integrate-and-fire neuron model. The VC dimension quantifies the ability of a function class to partition an input pattern space, and can be considered a measure of computational capacity. In this case, the function class is the class of integrate-and-fire models generated by varying the integration time constant T and the threshold θ, the input space they partition is the space of continuous-time signals, and the binary partition is specified by whether or not the model reaches threshold at some specified time. We show that the VC dimension diverges only logarithmically with the input signal bandwidth N. We also extend this approach to arbitrary passive dendritic trees. The main contributions of this work are (1) it offers a novel treatment of computational capacity of this class of dynamic system; and (2) it provides a framework for analyzing the computational capabilities of the dynamic systems defined by networks of spiking neurons.
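A hedged sketch of the function class in question (a forward-Euler toy, not the authors' exact model or readout convention): each setting of the integration time constant and the threshold maps a continuous-time input signal to a binary label according to whether the membrane potential ever reaches threshold:

```python
import math

def reaches_threshold(signal, tau, theta, dt=0.01):
    """Toy leaky integrate-and-fire classifier.

    Integrates dV/dt = (-V + I(t)) / tau by forward Euler and returns 1 if
    the membrane potential V reaches the threshold theta at any time step,
    else 0. Varying the pair (time constant, threshold) sweeps out a
    binary-valued function class of the kind whose VC dimension the paper
    analyzes.
    """
    v = 0.0
    for current in signal:
        v += dt * (-v + current) / tau
        if v >= theta:
            return 1
    return 0

# A slowly varying input current: whether it is classified as 1 depends on
# the chosen threshold (and time constant).
signal = [math.sin(2 * math.pi * i * 0.01) + 1.0 for i in range(100)]
print(reaches_threshold(signal, tau=0.1, theta=0.9))  # 1: potential crosses 0.9
print(reaches_threshold(signal, tau=0.1, theta=5.0))  # 0: threshold never reached
```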

18 citations

References
Book
01 Jan 1979
TL;DR: This is the second edition of a quarterly column that provides a continuing update to the list of problems (NP-complete and harder) presented by M. R. Garey and D. S. Johnson in their book "Computers and Intractability: A Guide to the Theory of NP-Completeness," W. H. Freeman & Co., San Francisco, 1979.
Abstract: This is the second edition of a quarterly column the purpose of which is to provide a continuing update to the list of problems (NP-complete and harder) presented by M. R. Garey and myself in our book ‘‘Computers and Intractability: A Guide to the Theory of NP-Completeness,’’ W. H. Freeman & Co., San Francisco, 1979 (hereinafter referred to as ‘‘[G&J]’’; previous columns will be referred to by their dates). A background equivalent to that provided by [G&J] is assumed. Readers having results they would like mentioned (NP-hardness, PSPACE-hardness, polynomial-time-solvability, etc.), or open problems they would like publicized, should send them to David S. Johnson, Room 2C355, Bell Laboratories, Murray Hill, NJ 07974, including details, or at least sketches, of any new proofs (full papers are preferred). In the case of unpublished results, please state explicitly that you would like the results mentioned in the column. Comments and corrections are also welcome. For more details on the nature of the column and the form of desired submissions, see the December 1981 issue of this journal.

40,020 citations

Book
01 Jan 1968
TL;DR: The arrangement of this invention provides a strong vibration free hold-down mechanism while avoiding a large pressure drop to the flow of coolant fluid.
Abstract: A fuel pin hold-down and spacing apparatus for use in nuclear reactors is disclosed. Fuel pins forming a hexagonal array are spaced apart from each other and held-down at their lower end, securely attached at two places along their length to one of a plurality of vertically disposed parallel plates arranged in horizontally spaced rows. These plates are in turn spaced apart from each other and held together by a combination of spacing and fastening means. The arrangement of this invention provides a strong vibration free hold-down mechanism while avoiding a large pressure drop to the flow of coolant fluid. This apparatus is particularly useful in connection with liquid cooled reactors such as liquid metal cooled fast breeder reactors.

17,939 citations

Book
01 Jan 1973
TL;DR: In this article, a unified, comprehensive and up-to-date treatment of both statistical and descriptive methods for pattern recognition is provided, including Bayesian decision theory, supervised and unsupervised learning, nonparametric techniques, discriminant analysis, clustering, preprocessing of pictorial data, spatial filtering, shape description techniques, perspective transformations, projective invariants, linguistic procedures, and artificial intelligence techniques for scene analysis.
Abstract: Provides a unified, comprehensive and up-to-date treatment of both statistical and descriptive methods for pattern recognition. The topics treated include Bayesian decision theory, supervised and unsupervised learning, nonparametric techniques, discriminant analysis, clustering, preprocessing of pictorial data, spatial filtering, shape description techniques, perspective transformations, projective invariants, linguistic procedures, and artificial intelligence techniques for scene analysis.

13,647 citations