Journal ArticleDOI

Learnability and the Vapnik-Chervonenkis dimension

TL;DR: This paper shows that the essential condition for distribution-free learnability is finiteness of the Vapnik-Chervonenkis dimension, a simple combinatorial parameter of the class of concepts to be learned.
Abstract: Valiant's learnability model is extended to learning classes of concepts defined by regions in Euclidean space En. The methods in this paper lead to a unified treatment of some of Valiant's results, along with previous results on distribution-free convergence of certain pattern recognition algorithms. It is shown that the essential condition for distribution-free learnability is finiteness of the Vapnik-Chervonenkis dimension, a simple combinatorial parameter of the class of concepts to be learned. Using this parameter, the complexity and closure properties of learnable classes are analyzed, and the necessary and sufficient conditions are provided for feasible learnability.
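The combinatorial parameter at the heart of the abstract can be made concrete with a brute-force check. The sketch below is illustrative only (the helper names and the toy concept class of intervals are my own, not from the paper): a point set is shattered if every labeling of it is realized by some concept, and the VC dimension is the size of the largest shattered set.

```python
from itertools import combinations

def shatters(points, concept_class):
    # A set of points is shattered if all 2^n of its labelings
    # are realized by membership in some concept.
    labelings = {tuple(p in c for p in points) for c in concept_class}
    return len(labelings) == 2 ** len(points)

def vc_dimension(domain, concept_class, max_d=5):
    # Largest k such that some k-subset of the domain is shattered.
    d = 0
    for k in range(1, max_d + 1):
        if any(shatters(s, concept_class) for s in combinations(domain, k)):
            d = k
    return d

# Toy concept class: all intervals {a, ..., b-1} over the domain {0, ..., 4}.
domain = list(range(5))
intervals = [frozenset(range(a, b)) for a in range(6) for b in range(a, 6)]
print(vc_dimension(domain, intervals))  # intervals shatter 2 points but never 3
```

Intervals cannot realize the labeling "in, out, in" on three collinear points, so the check returns 2; the abstract's point is that finiteness of this parameter, not any property of the distribution, is what governs learnability.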


Citations
Journal ArticleDOI
TL;DR: This work shows that the VC dimension of the class of sets within a fixed distance of a k-dimensional affine subspace of ℝd is within a constant factor of (k+1)(d−k+1); in the course of the upper bound proof, it also gives a simple proof of Warren's bound on the number of sign sequences of real polynomials and discusses the distribution of eigenvalues of a data covariance matrix.
Abstract: Motivated by the statistical learning theoretic treatment of principal component analysis, we are concerned with the set of points in ℝd that are within a certain distance from a k-dimensional affine subspace. We prove that the VC dimension of the class of such sets is within a constant factor of (k+1)(d−k+1), and then discuss the distribution of eigenvalues of a data covariance matrix by using our bounds on the VC dimension and Vapnik’s statistical learning theory. In the course of the upper bound proof, we provide a simple proof of Warren’s bound on the number of sign sequences of real polynomials.

10 citations


Cites background from "Learnability and the Vapnik-Chervon..."

  • ...The upper bound on the VC dimension gives non-asymptotic, distribution-independent evaluations both for the convergence rates and for the sample complexity in the style of computational learning theory [3]....


Book ChapterDOI
18 Sep 2007
TL;DR: This paper presents a novel approach to minimising sets of metrics by identifying and removing metrics that have little effect on the overall quality assessment, demonstrated with results from an experiment.
Abstract: Software metrics are an essential means to assess software quality. For the assessment of software quality, typically sets of complementing metrics are used, since individual metrics cover only isolated quality aspects rather than a quality characteristic as a whole. The choice of the metrics within such metric sets, however, is non-trivial. Metrics may intuitively appear to be complementing, but they are often in fact non-orthogonal, i.e. the information they provide may overlap to some extent. In the past, such redundant metrics have been identified, for example, by statistical correlation methods. This paper presents a novel, machine-learning-based approach to minimising sets of metrics by identifying and removing metrics which have little effect on the overall quality assessment. To demonstrate the application of this approach, results from an experiment are provided. In this experiment, a set of metrics that is used to assess the analysability of test suites specified using the Testing and Test Control Notation (TTCN-3) is investigated.

10 citations

Dissertation
01 Jan 2017
TL;DR: This thesis presents a methodology that understands quantum machine learning as the combination of data encoding into quantum systems and quantum optimisation, and proposes several quantum algorithms for supervised pattern recognition.
Abstract: Humans are experts at recognising patterns in past experience and applying them to new tasks. For example, after seeing pictures of a face we can usually tell if another image contains the same person or not. Machine learning is a research discipline at the intersection of computer science, statistics and mathematics that investigates how pattern recognition can be performed by machines and for large amounts of data. In recent years, machine learning has come into the focus of quantum computing, in which information processing based on the laws of quantum theory is explored. Although large-scale quantum computers are still in the first stages of development, their theoretical description is well understood and can be used to formulate ‘quantum software’ or ‘quantum algorithms’ for pattern recognition. Researchers can therefore analyse the impact quantum computers may have on intelligent data mining. This approach is part of the emerging research discipline of quantum machine learning, which harvests synergies between quantum computing and machine learning. The research objective of this thesis is to understand how we can solve a slightly more specific problem called supervised pattern recognition, based on the language that has been developed for universal quantum computers. The contribution it makes is twofold: First, it presents a methodology that understands quantum machine learning as the combination of data encoding into quantum systems and quantum optimisation. Second, it proposes several quantum algorithms for supervised pattern recognition. These include algorithms for convex and non-convex optimisation, implementations of distance-based methods through quantum interference, and the preparation of quantum states from which solutions can be derived via sampling. Amongst the machine learning methods considered are least-squares linear regression, gradient descent and Newton’s method, k-nearest neighbour, neural networks as well as ensemble methods.
Together with the growing body of literature, this thesis demonstrates that quantum computing offers a number of interesting tools for machine learning applications, and has the potential to create new models of how to learn from data.

10 citations

Journal ArticleDOI
TL;DR: This paper corrects, and simplifies the analysis of, the O(lg lg OPT)-approximation algorithm for the standard guarding problem in which all points within the gallery are required to be visible to at least one guard.
Abstract: We consider a generalization of the familiar art gallery problem in which individual points within the gallery need to be visible to some specified, but not necessarily uniform, number of guards. We provide an $$O(\lg \lg \mathrm{OPT})$$-approximation algorithm for this multi-guarding problem in simply-connected polygonal regions, with a minimum number ($$\mathrm{OPT}$$) of vertex guards (possibly co-located). Our approximation algorithm is based on a polynomial-time algorithm for building what we call $$\varepsilon$$-multinets of size $$O\left(\frac{1}{\varepsilon}\lg \lg \frac{1}{\varepsilon}\right)$$ for the instances of Multi-HittingSet associated with our multi-guarding problem. We then apply a now-standard linear-programming technique to build an approximation algorithm from this $$\varepsilon$$-multinet finder. This paper corrects, and simplifies the analysis of, the $$O\left(\frac{1}{\varepsilon}\lg \lg \frac{1}{\varepsilon}\right)$$-time $$\varepsilon$$-net-finder described in [26] that was used to build an $$O(\lg \lg \mathrm{OPT})$$-approximation algorithm for the standard guarding problem in which all points within the gallery are required to be visible to at least one guard.

10 citations

Proceedings ArticleDOI
01 Dec 2016
TL;DR: This paper presents active over-labeling, a new approach for learning hierarchically decomposable concepts, and shows that it substantially improves area under the precision-recall curve compared with standard passive or active learning.
Abstract: Many classification tasks target high-level concepts that can be decomposed into a hierarchy of finer-grained sub-concepts. For example, some string entities that are Locations are also Attractions, some Attractions are Museums, etc. Such hierarchies are common in named entity recognition (NER), document classification, and biological sequence analysis. We present a new approach for learning hierarchically decomposable concepts. The approach learns a high-level classifier (e.g., location vs. non-location) by separately learning multiple finer-grained classifiers (e.g., museum vs. non-museum), and then combining the results. Soliciting labels at a finer level of granularity than that of the target concept is a new approach to active learning, which we term active over-labeling. In experiments in NER and document classification tasks, we show that active over-labeling substantially improves area under the precision-recall curve when compared with standard passive or active learning. Finally, because finer-grained labels may be more expensive to obtain, we also present a cost-sensitive active learner that uses a multi-armed bandit approach to dynamically choose the label granularity to target, and show that the bandit-based learner is robust to differences in label cost and labeling budget.
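The bandit-based choice of label granularity described above can be sketched generically. The code below is a hypothetical illustration using the standard UCB1 selection rule; the class name, the reward model, and the per-arm payoffs are my own assumptions, not the authors' implementation. Each arm stands for requesting labels at one granularity, and the reward would in practice be some measure of classifier improvement per unit labeling cost.

```python
import math
import random

class UCB1:
    """Standard UCB1 bandit: pull the arm with the highest upper confidence bound."""
    def __init__(self, n_arms):
        self.counts = [0] * n_arms    # number of pulls per arm
        self.values = [0.0] * n_arms  # running mean reward per arm

    def select(self):
        # Play every arm once before applying the confidence rule.
        for arm, c in enumerate(self.counts):
            if c == 0:
                return arm
        total = sum(self.counts)
        ucb = [v + math.sqrt(2 * math.log(total) / c)
               for v, c in zip(self.values, self.counts)]
        return max(range(len(ucb)), key=ucb.__getitem__)

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]

# Hypothetical simulation: arm 0 = coarse labels, arm 1 = fine-grained labels,
# with arm 1 assumed to yield higher reward per unit cost.
random.seed(0)
bandit = UCB1(2)
mean_reward = [0.2, 0.8]
for _ in range(500):
    arm = bandit.select()
    bandit.update(arm, 1.0 if random.random() < mean_reward[arm] else 0.0)
print(bandit.counts)  # the higher-payoff arm accumulates most of the pulls
```

UCB1's logarithmic exploration term guarantees that a lower-payoff granularity keeps being sampled occasionally, which is what makes such a learner robust when label costs or budgets shift.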

10 citations


Cites background or methods from "Learnability and the Vapnik-Chervon..."

  • ...[9] that says if the algorithm efficiently finds a hypothesis from C that is consistent with U , which is drawn iid from fixed distribution D and is of size at least...


  • ...We define the level-2 intervals by taking the union of consecutive triples of the level-3 intervals: {[0, 1), [2, 3), [4, 5)}, {[6, 7), [8, 9), [10, 11)}, and {[12, 13), [14, 15), [16, 17)}....


  • ...[9] show that, if m is sufficiently large, then a consistent hypothesis h will...


  • ...It is known to be NP-hard [9, 10] to find a smallest set of rectangles to cover a set of points in Rd even for d = 2....

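The truncated quotations above (where [9] is this paper) allude to its well-known sample-complexity guarantee for consistent hypotheses. Stated in its standard asymptotic form: if the concept class C has VC dimension d, then

$$m = O\!\left(\frac{1}{\varepsilon}\log\frac{1}{\delta} + \frac{d}{\varepsilon}\log\frac{1}{\varepsilon}\right)$$

i.i.d. examples from any fixed distribution D suffice so that, with probability at least $$1-\delta$$, every hypothesis in C consistent with the sample has error at most $$\varepsilon$$.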

References
Book
01 Jan 1979
TL;DR: This is the second edition of a quarterly column that provides a continuing update to the list of problems (NP-complete and harder) presented by M. R. Garey and D. S. Johnson in their book “Computers and Intractability: A Guide to the Theory of NP-Completeness,” W. H. Freeman & Co., San Francisco, 1979.
Abstract: This is the second edition of a quarterly column the purpose of which is to provide a continuing update to the list of problems (NP-complete and harder) presented by M. R. Garey and myself in our book ‘‘Computers and Intractability: A Guide to the Theory of NP-Completeness,’’ W. H. Freeman & Co., San Francisco, 1979 (hereinafter referred to as ‘‘[G&J]’’; previous columns will be referred to by their dates). A background equivalent to that provided by [G&J] is assumed. Readers having results they would like mentioned (NP-hardness, PSPACE-hardness, polynomial-time-solvability, etc.), or open problems they would like publicized, should send them to David S. Johnson, Room 2C355, Bell Laboratories, Murray Hill, NJ 07974, including details, or at least sketches, of any new proofs (full papers are preferred). In the case of unpublished results, please state explicitly that you would like the results mentioned in the column. Comments and corrections are also welcome. For more details on the nature of the column and the form of desired submissions, see the December 1981 issue of this journal.

40,020 citations

Book
01 Jan 1968
TL;DR: The arrangement of this invention provides a strong vibration free hold-down mechanism while avoiding a large pressure drop to the flow of coolant fluid.
Abstract: A fuel pin hold-down and spacing apparatus for use in nuclear reactors is disclosed. Fuel pins forming a hexagonal array are spaced apart from each other and held-down at their lower end, securely attached at two places along their length to one of a plurality of vertically disposed parallel plates arranged in horizontally spaced rows. These plates are in turn spaced apart from each other and held together by a combination of spacing and fastening means. The arrangement of this invention provides a strong vibration free hold-down mechanism while avoiding a large pressure drop to the flow of coolant fluid. This apparatus is particularly useful in connection with liquid cooled reactors such as liquid metal cooled fast breeder reactors.

17,939 citations

Book
01 Jan 1973
TL;DR: In this article, a unified, comprehensive and up-to-date treatment of both statistical and descriptive methods for pattern recognition is provided, including Bayesian decision theory, supervised and unsupervised learning, nonparametric techniques, discriminant analysis, clustering, preprocessing of pictorial data, spatial filtering, shape description techniques, perspective transformations, projective invariants, linguistic procedures, and artificial intelligence techniques for scene analysis.
Abstract: Provides a unified, comprehensive and up-to-date treatment of both statistical and descriptive methods for pattern recognition. The topics treated include Bayesian decision theory, supervised and unsupervised learning, nonparametric techniques, discriminant analysis, clustering, preprocessing of pictorial data, spatial filtering, shape description techniques, perspective transformations, projective invariants, linguistic procedures, and artificial intelligence techniques for scene analysis.

13,647 citations