Journal Article

Learnability and the Vapnik-Chervonenkis dimension

TL;DR: This paper shows that the essential condition for distribution-free learnability is finiteness of the Vapnik-Chervonenkis dimension, a simple combinatorial parameter of the class of concepts to be learned.
Abstract: Valiant's learnability model is extended to learning classes of concepts defined by regions in Euclidean space Eⁿ. The methods in this paper lead to a unified treatment of some of Valiant's results, along with previous results on distribution-free convergence of certain pattern recognition algorithms. It is shown that the essential condition for distribution-free learnability is finiteness of the Vapnik-Chervonenkis dimension, a simple combinatorial parameter of the class of concepts to be learned. Using this parameter, the complexity and closure properties of learnable classes are analyzed, and necessary and sufficient conditions are provided for feasible learnability.
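As a concrete illustration (a minimal sketch of my own, not code from the paper), the Vapnik-Chervonenkis dimension of a concept class is the size of the largest point set the class shatters, i.e. on which it realizes every possible labelling. For closed intervals [a, b] on the real line the VC dimension is 2: any two points can be shattered, but the labelling (+, -, +) of three ordered points cannot be produced by a single interval. A short check in Python:

from itertools import product

def interval_shatters(points):
    """Return True if the class of closed intervals [a, b] shatters the given points."""
    for labelling in product([0, 1], repeat=len(points)):
        positives = [x for x, y in zip(points, labelling) if y == 1]
        if not positives:
            continue  # the all-negative labelling is realized by an interval away from the points
        a, b = min(positives), max(positives)
        realized = tuple(int(a <= x <= b) for x in points)
        if realized != labelling:
            return False
    return True

print(interval_shatters([0.2, 0.8]))       # True  -> VC dimension is at least 2
print(interval_shatters([0.2, 0.5, 0.8]))  # False -> these 3 points are not shattered (and in fact no 3 points are)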


Citations
Proceedings Article
12 Nov 2018
TL;DR: This study used a convolutional neural network (CNN) to automatically extract features from raw EMG images, and a Support Vector Machine (SVM) classifier was then employed to identify the hand motions.
Abstract: A synthetic approach was proposed to improve recognition accuracy. Unlike traditional feature extractors, this study used a convolutional neural network (CNN) to automatically extract features from the raw EMG image input. A Support Vector Machine (SVM) classifier was then employed to identify the hand motions. Our experiments showed that the proposed method achieved accuracy around 2.5% higher than using the CNN alone, and about 9.7% higher than the traditional method (i.e. time-domain features with an SVM classifier). Both inter-subject and inter-session tests demonstrated the robustness of the CNN-based features.
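A rough sketch (my own illustration with assumed toy data and layer sizes, not the authors' pipeline) of the general CNN-features-into-SVM pattern described above, using PyTorch for the feature extractor and scikit-learn for the classifier:

import numpy as np
import torch
import torch.nn as nn
from sklearn.svm import SVC

class FeatureCNN(nn.Module):
    # Small convolutional feature extractor for single-channel "EMG image" inputs (hypothetical sizes).
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(8, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),          # -> (N, 16, 1, 1)
        )

    def forward(self, x):
        return self.features(x).flatten(1)    # (N, 16) feature vectors

# Toy stand-ins for 8x8 EMG images with 4 hand-motion classes (not real data).
X = torch.randn(64, 1, 8, 8)
y = np.random.randint(0, 4, size=64)

cnn = FeatureCNN().eval()
with torch.no_grad():
    feats = cnn(X).numpy()

clf = SVC(kernel="rbf").fit(feats, y)         # SVM on the CNN features
print("training accuracy:", clf.score(feats, y))

In a real setting the CNN would first be trained or fine-tuned on labelled EMG data before its features are handed to the SVM; the sketch only shows how the two stages connect.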

14 citations


Cites methods from "Learnability and the Vapnik-Chervonenkis dimension"

  • ...On the one hand, compared with a neural network classifier, which is based on the empirical risk minimization learning algorithm [20] and easily falls into a local optimum, SVM rests on VC dimension theory [21] and structural risk minimization [22] from statistical learning theory....


Proceedings Article
23 Mar 1992
TL;DR: The authors derive necessary and sufficient conditions on the number of training examples required to guarantee a particular generalization performance and compare the resulting bounds with those available for (multilayer) feedforward networks of linear threshold elements (LTEs).
Abstract: The pattern classification and generalization ability of the class of generalized single-layer networks (GSLNs) is analyzed using techniques from computational learning theory. The authors derive necessary and sufficient conditions on the number of training examples required to guarantee a particular generalization performance and compare the bounds obtained with those available for (multilayer) feedforward networks of linear threshold elements (LTEs). This allows one to show that, on the basis of currently available bounds, the number of training examples sufficient for GSLNs tends to be considerably smaller than for feedforward networks of LTEs with the same number of weights. It is also shown that the use of self-structuring techniques for GSLNs may reduce the number of training examples sufficient for good generalization. An explanation is given for the fact that GSLNs can require a relatively large number of weights.
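To give a feel for how such VC-dimension-based sufficient sample sizes scale, here is a small sketch (my own; it evaluates the commonly quoted sufficient-sample-size bound associated with the headline Blumer et al. paper, and the exact constants vary between statements of the result):

from math import log2, ceil

def sufficient_sample_size(d, eps, delta):
    # Commonly quoted sufficient sample size for a class of VC dimension d,
    # accuracy eps and confidence 1 - delta (constants are indicative only).
    return ceil(max((4 / eps) * log2(2 / delta), (8 * d / eps) * log2(13 / eps)))

# Larger VC dimension (roughly, more weights) demands many more examples
# at the same accuracy and confidence:
for d in (10, 100, 1000):
    print(d, sufficient_sample_size(d, eps=0.1, delta=0.05))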

14 citations


Cites background from "Learnability and the Vapnik-Chervonenkis dimension"

  • ...THEOREM 3.2 (VAPNIK [9], BLUMER et al. [7]) Let F be a well-behaved class of functions f : ℝⁿ → {+1, −1} and let 0 < γ ≤ 1, δ < 1 and 0 < ε. Let Tₖ be a sequence of k examples drawn independently according to the distribution P′ on ℝⁿ × {+1, −1} and let P be the probability that there is some function f ∈ F which disagrees with at most a fraction (1 − γ)ε of the examples in Tₖ but has error greater...


  • ...We first require an environment X which in this paper is always ℝⁿ. A concept class C and an hypothesis space H are both defined as sets of subsets of X. Any element of C or H must be a Borel set and we may also require that C and H are well-behaved as defined in [7]; these requirements are discussed in full in [SI where in particular we show that all spaces used in this paper are well-behaved where necessary....


Journal Article
TL;DR: In this article, the authors prove general lower bounds on the number of examples needed for learning function classes within different natural learning models which are related to pac-learning and coincide with the pac learning model of Valiant in the case of {0, 1}-valued functions.
Abstract: We prove general lower bounds on the number of examples needed for learning function classes within different natural learning models which are related to pac-learning (and coincide with the pac-learning model of Valiant in the case of {0,1}-valued functions). The lower bounds are obtained by showing that all nontrivial function classes contain a "hard binary-valued subproblem." Although (at first glance) it seems to be likely that real-valued function classes are much harder to learn than their hardest binary-valued subproblem, we show that these general lower bounds cannot be improved by more than a logarithmic factor. This is done by discussing some natural function classes like nondecreasing functions or piecewise-smooth functions (the function classes that were discussed in [M. J. Kearns and R. E. Schapire, Proc. 31st Annual Symposium on the Foundations of Computer Science, IEEE Computer Society Press, Los Alamitos, CA, 1990, pp. 382--392, full version, J. Comput. System Sci., 48 (1994), pp. 464--497], [D. Kimber and P. M. Long, Proc. 5th Annual Workshop on Computational Learning Theory, ACM, New York, 1992, pp. 153--160]) with certain restrictions concerning their slope.

14 citations

Book Chapter
01 Jan 1993
TL;DR: This paper presents a learning algorithm that implements tree-structured bias, i.e., that learns, probably approximately correctly from random examples and membership queries, any target function obeying a given tree-structured bias.
Abstract: Incorporating declarative bias or prior knowledge into learning is an active research topic in machine learning. Tree-structured bias specifies the prior knowledge as a tree of “relevance” relationships between attributes. This paper presents a learning algorithm that implements tree-structured bias, i.e., that learns, probably approximately correctly from random examples and membership queries, any target function obeying a given tree-structured bias. The theoretical predictions of the paper are empirically validated.
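As a very rough illustration only (my own sketch of one plausible way to represent such a bias, not the chapter's formulation), a tree-structured bias can be pictured as a tree whose leaves are input attributes and whose internal nodes assert that the target depends on the attributes beneath them only through some intermediate value:

from dataclasses import dataclass, field
from typing import List, Union

@dataclass
class BiasNode:
    # Children are either attribute indices (leaves) or further BiasNode groupings.
    children: List[Union["BiasNode", int]] = field(default_factory=list)

# Hypothetical bias over attributes 0..3: the target may depend on (0, 1) and
# on (2, 3) only through two intermediate groupings.
bias = BiasNode(children=[BiasNode(children=[0, 1]), BiasNode(children=[2, 3])])
print(bias)

A target function obeys the bias when it factors through such intermediate values; the algorithm described in the chapter exploits that structure using membership queries.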

14 citations

Proceedings Article
01 Jan 2020
TL;DR: While previous negative results showed this model to have intractably large sample complexity for label queries, it is shown that comparison queries make RPU-learning at worst logarithmically more expensive in both the passive and active regimes.
Abstract: In the world of big data, large but costly to label datasets dominate many fields. Active learning, a semi-supervised alternative to the standard PAC-learning model, was introduced to explore whether adaptive labeling could learn concepts with exponentially fewer labeled samples. While previous results show that active learning performs no better than its supervised alternative for important concept classes such as linear separators, we show that by adding weak distributional assumptions and allowing comparison queries, active learning requires exponentially fewer samples. Further, we show that these results hold as well for a stronger model of learning called Reliable and Probably Useful (RPU) learning. In this model, our learner is not allowed to make mistakes, but may instead answer "I don't know." While previous negative results showed this model to have intractably large sample complexity for label queries, we show that comparison queries make RPU-learning at worst logarithmically more expensive in both the passive and active regimes.
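For intuition about the sample-complexity gaps discussed above, here is a small sketch (my own illustration of the basic active-learning effect with ordinary label queries; the paper's contribution concerns comparison queries and RPU learning, which this does not implement): adaptively chosen label queries locate a one-dimensional threshold with exponentially fewer labels than passive sampling needs.

import numpy as np

rng = np.random.default_rng(0)
true_threshold = 0.37                      # unknown to the learner
pool = np.sort(rng.random(10_000))         # cheap unlabeled pool

def label(x):                              # costly label oracle
    return int(x >= true_threshold)

lo, hi, queries = 0, len(pool) - 1, 0      # for this pool, label(pool[lo]) = 0 and label(pool[hi]) = 1
while lo + 1 < hi:
    mid = (lo + hi) // 2
    queries += 1
    if label(pool[mid]):
        hi = mid
    else:
        lo = mid

print(f"threshold located near {pool[hi]:.3f} using {queries} label queries")
# Roughly log2(pool size), about 14 queries here, versus on the order of
# 1/epsilon labels for passive learning to reach accuracy epsilon.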

14 citations

References
Book
01 Jan 1979
TL;DR: The second edition of a quarterly column that provides a continuing update to the list of problems (NP-complete and harder) presented by M. R. Garey and D. S. Johnson in their book “Computers and Intractability: A Guide to the Theory of NP-Completeness,” W. H. Freeman & Co., San Francisco, 1979.
Abstract: This is the second edition of a quarterly column the purpose of which is to provide a continuing update to the list of problems (NP-complete and harder) presented by M. R. Garey and myself in our book ‘‘Computers and Intractability: A Guide to the Theory of NP-Completeness,’’ W. H. Freeman & Co., San Francisco, 1979 (hereinafter referred to as ‘‘[G&J]’’; previous columns will be referred to by their dates). A background equivalent to that provided by [G&J] is assumed. Readers having results they would like mentioned (NP-hardness, PSPACE-hardness, polynomial-time-solvability, etc.), or open problems they would like publicized, should send them to David S. Johnson, Room 2C355, Bell Laboratories, Murray Hill, NJ 07974, including details, or at least sketches, of any new proofs (full papers are preferred). In the case of unpublished results, please state explicitly that you would like the results mentioned in the column. Comments and corrections are also welcome. For more details on the nature of the column and the form of desired submissions, see the December 1981 issue of this journal.

40,020 citations

Book
01 Jan 1968

17,939 citations

Book
01 Jan 1973
TL;DR: In this article, a unified, comprehensive and up-to-date treatment of both statistical and descriptive methods for pattern recognition is provided, including Bayesian decision theory, supervised and unsupervised learning, nonparametric techniques, discriminant analysis, clustering, preprocessing of pictorial data, spatial filtering, shape description techniques, perspective transformations, projective invariants, linguistic procedures, and artificial intelligence techniques for scene analysis.
Abstract: Provides a unified, comprehensive and up-to-date treatment of both statistical and descriptive methods for pattern recognition. The topics treated include Bayesian decision theory, supervised and unsupervised learning, nonparametric techniques, discriminant analysis, clustering, preprocessing of pictorial data, spatial filtering, shape description techniques, perspective transformations, projective invariants, linguistic procedures, and artificial intelligence techniques for scene analysis.

13,647 citations