Journal ArticleDOI

Theory of Reproducing Kernels.

01 Jan 1950 - Transactions of the American Mathematical Society (Defense Technical Information Center) - Vol. 68, Iss. 3, pp. 337-404
TL;DR: In this paper, a short historical introduction indicates the different manners in which these kernels have been used by various investigators and discusses the more important trends in their application, without attempting, however, a complete bibliography of the subject matter.
Abstract: The present paper may be considered as a sequel to our previous paper in the Proceedings of the Cambridge Philosophical Society, Théorie générale de noyaux reproduisants - Première partie (vol. 39 (1944)), which was written in 1942-1943. In the introduction to that paper we outlined the plan of papers which were to follow. In the meantime, however, the general theory has been developed in many directions, and our original plans have had to be changed. Due to wartime conditions we were not able, at the time of writing the first paper, to take into account all the earlier investigations which, although sometimes of quite a different character, were, nevertheless, related to our subject. Our investigation is concerned with kernels of a special type which have been used under different names and in different ways in many domains of mathematical research. We shall therefore begin our present paper with a short historical introduction, in which we shall attempt to indicate the different manners in which these kernels have been used by various investigators, and to clarify the terminology. We shall also discuss the more important trends of the application of these kernels without attempting, however, a complete bibliography of the subject matter.


Citations
Book
23 Nov 2005
TL;DR: The treatment is comprehensive and self-contained, targeted at researchers and students in machine learning and applied statistics, and deals with the supervised learning problem for both regression and classification.
Abstract: Gaussian processes (GPs) provide a principled, practical, probabilistic approach to learning in kernel machines. GPs have received increased attention in the machine-learning community over the past decade, and this book provides a long-needed systematic and unified treatment of theoretical and practical aspects of GPs in machine learning. The treatment is comprehensive and self-contained, targeted at researchers and students in machine learning and applied statistics. The book deals with the supervised-learning problem for both regression and classification, and includes detailed algorithms. A wide variety of covariance (kernel) functions are presented and their properties discussed. Model selection is discussed both from a Bayesian and a classical perspective. Many connections to other well-known techniques from machine learning and statistics are discussed, including support-vector machines, neural networks, splines, regularization networks, relevance vector machines and others. Theoretical issues including learning curves and the PAC-Bayesian framework are treated, and several approximation methods for learning with large datasets are discussed. The book contains illustrative examples and exercises, and code and datasets are available on the Web. Appendixes provide mathematical background and a discussion of Gaussian Markov processes.
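The predictive equations that the book develops for GP regression admit a compact implementation. Below is a minimal NumPy sketch of the standard Cholesky-based computation of the posterior mean and covariance; the RBF covariance, hyperparameter values and toy data are illustrative assumptions, not taken from the book.

    import numpy as np

    def rbf_kernel(A, B, lengthscale=1.0, variance=1.0):
        # Squared-exponential covariance: k(x, x') = variance * exp(-||x - x'||^2 / (2 * lengthscale^2))
        sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
        return variance * np.exp(-0.5 * sq / lengthscale**2)

    def gp_regression(X, y, X_star, noise_var=1e-2):
        # Posterior mean and covariance of a zero-mean GP at test inputs X_star,
        # computed via a Cholesky factorisation for numerical stability.
        K = rbf_kernel(X, X) + noise_var * np.eye(len(X))
        K_s = rbf_kernel(X, X_star)
        L = np.linalg.cholesky(K)                            # K = L L^T
        alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # alpha = K^{-1} y
        mean = K_s.T @ alpha
        v = np.linalg.solve(L, K_s)
        cov = rbf_kernel(X_star, X_star) - v.T @ v
        return mean, cov

    # Toy usage: fit a noisy sine and predict at new points.
    rng = np.random.default_rng(0)
    X = rng.uniform(-3, 3, size=(20, 1))
    y = np.sin(X).ravel() + 0.1 * rng.standard_normal(20)
    X_star = np.linspace(-3, 3, 5).reshape(-1, 1)
    mean, cov = gp_regression(X, y, X_star)
    print(mean)          # posterior mean at the test points
    print(np.diag(cov))  # posterior variances at the test points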

11,357 citations


Cites background from "Theory of Reproducing Kernels."

  • ...1 (Moore-Aronszajn theorem, Aronszajn [1950])....


  • ...The theory was developed by Aronszajn [1950]; a more recent treatise is Saitoh [1988]....

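For reference, the Moore-Aronszajn theorem mentioned in the excerpt above can be stated as follows (a standard modern formulation, not a quotation from either work):

    \textbf{Theorem (Moore--Aronszajn).}
    Let $k : X \times X \to \mathbb{R}$ be symmetric and positive definite, i.e.
    $\sum_{i,j=1}^{n} c_i c_j \, k(x_i, x_j) \ge 0$ for all $n \in \mathbb{N}$,
    all $x_1, \dots, x_n \in X$ and all $c_1, \dots, c_n \in \mathbb{R}$.
    Then there exists a unique Hilbert space $\mathcal{H}_k$ of functions on $X$
    for which $k$ is a reproducing kernel, i.e. $k(\cdot, x) \in \mathcal{H}_k$ and
    $f(x) = \langle f, k(\cdot, x) \rangle_{\mathcal{H}_k}$
    for all $f \in \mathcal{H}_k$ and $x \in X$.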

Journal ArticleDOI
TL;DR: This tutorial gives an overview of the basic ideas underlying Support Vector (SV) machines for function estimation, and includes a summary of currently used algorithms for training SV machines, covering both the quadratic programming part and advanced methods for dealing with large datasets.
Abstract: In this tutorial we give an overview of the basic ideas underlying Support Vector (SV) machines for function estimation. Furthermore, we include a summary of currently used algorithms for training SV machines, covering both the quadratic (or convex) programming part and advanced methods for dealing with large datasets. Finally, we mention some modifications and extensions that have been applied to the standard SV algorithm, and discuss the aspect of regularization from a SV perspective.
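As a concrete illustration of the epsilon-insensitive SV regression the tutorial describes, here is a minimal sketch using scikit-learn's SVR; the library, hyperparameter values and toy data are assumptions for illustration, not part of the tutorial itself.

    import numpy as np
    from sklearn.svm import SVR

    # Toy 1-D regression problem: a noisy sine curve.
    rng = np.random.default_rng(0)
    X = rng.uniform(-3, 3, size=(80, 1))
    y = np.sin(X).ravel() + 0.1 * rng.standard_normal(80)

    # epsilon-insensitive SV regression with an RBF kernel; C trades off model
    # complexity against tolerance of deviations larger than epsilon.
    model = SVR(kernel="rbf", C=10.0, epsilon=0.1)
    model.fit(X, y)

    X_test = np.linspace(-3, 3, 5).reshape(-1, 1)
    print(model.predict(X_test))  # predicted function values
    print(len(model.support_))    # number of support vectors retained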

10,696 citations


Cites background from "Theory of Reproducing Kernels."

  • ...Another possible setting also might be an operator P mapping from L2(Rn) into some Reproducing Kernel Hilbert Space (RKHS) (Aronszajn, 1950, Kimeldorf and Wahba 1971, Saitoh 1988, Schölkopf 1997, Girosi 1998)....


Book
01 Jan 2004
TL;DR: This book provides an easy introduction for students and researchers to the growing field of kernel-based pattern analysis, demonstrating with examples how to handcraft an algorithm or a kernel for a new specific application, and covering all the necessary conceptual and mathematical tools to do so.
Abstract: Kernel methods provide a powerful and unified framework for pattern discovery, motivating algorithms that can act on general types of data (e.g. strings, vectors or text) and look for general types of relations (e.g. rankings, classifications, regressions, clusters). The application areas range from neural networks and pattern recognition to machine learning and data mining. This book, developed from lectures and tutorials, fulfils two major roles: first, it provides practitioners with a large toolkit of algorithms, kernels and solutions ready to use for standard pattern discovery problems in fields such as bioinformatics, text analysis and image analysis; second, it provides an easy introduction for students and researchers to the growing field of kernel-based pattern analysis, demonstrating with examples how to handcraft an algorithm or a kernel for a new specific application, and covering all the necessary conceptual and mathematical tools to do so.
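To illustrate how a kernel can act directly on non-vectorial data such as strings, here is a minimal sketch of a p-spectrum kernel, which takes the inner product of substring-count vectors; the definition used is the standard one, and the example strings and value of p are arbitrary.

    from collections import Counter

    def spectrum_features(s: str, p: int) -> Counter:
        # Counts of all contiguous substrings of length p in s.
        return Counter(s[i:i + p] for i in range(len(s) - p + 1))

    def spectrum_kernel(s: str, t: str, p: int = 3) -> int:
        # p-spectrum kernel: inner product of the two substring-count vectors.
        fs, ft = spectrum_features(s, p), spectrum_features(t, p)
        return sum(count * ft[sub] for sub, count in fs.items())

    # Example: shared 2-grams between two words.
    print(spectrum_kernel("statistics", "computation", p=2))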

6,050 citations


Cites background from "Theory of Reproducing Kernels."

  • ...The use of kernels for function approximation however dates back to Aronszajn [6], as does the development of much of their theory [155]....


Journal ArticleDOI
TL;DR: This issue's collection of essays should help familiarize readers with this interesting new racehorse in the Machine Learning stable, and includes a practical guide and a new technique for implementing the algorithm efficiently.
Abstract: My first exposure to Support Vector Machines came this spring when I heard Sue Dumais present impressive results on text categorization using this analysis technique. This issue's collection of essays should help familiarize our readers with this interesting new racehorse in the Machine Learning stable. Bernhard Schölkopf, in an introductory overview, points out that a particular advantage of SVMs over other learning algorithms is that they can be analyzed theoretically using concepts from computational learning theory, and at the same time can achieve good performance when applied to real problems. Examples of these real-world applications are provided by Sue Dumais, who describes the aforementioned text-categorization problem, yielding the best results to date on the Reuters collection, and Edgar Osuna, who presents strong results on application to face detection. Our fourth author, John Platt, gives us a practical guide and a new technique for implementing the algorithm efficiently.

4,319 citations

Book
15 Dec 2001

4,112 citations


Additional excerpts

  • ...Roadmap • Elements of Statistical Learning Theory • Kernels and feature spaces • Support vector algorithms and other kernel methods • Applications B. Schölkopf, Canberra, February 2006...


  • ...The Feature Space for PD Kernels [4, 1, 50] • define a feature map Φ : X → ℝ^X, x ↦ k(....


  • ...Learning with Kernels Bernhard Schölkopf Max-Planck-Institut für biologische Kybernetik 72076 Tübingen, Germany bs@tuebingen.mpg.de B. Schölkopf, Canberra, February 2006...

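The feature map abbreviated in the excerpts above is the canonical one for a positive definite kernel; written out in full (standard material, reconstructed rather than quoted from the slides):

    \Phi : X \to \mathbb{R}^X, \qquad \Phi(x) = k(\cdot, x),
    \qquad \text{with} \quad
    \langle \Phi(x), \Phi(x') \rangle_{\mathcal{H}_k} = k(x, x'),
    \quad \text{and the reproducing property} \quad
    \langle f, \Phi(x) \rangle_{\mathcal{H}_k} = f(x).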

References
More filters
Journal ArticleDOI
TL;DR: The first part of the memoir is devoted to the definition of the various terms employed and to a re-statement of the consequences which follow from Hilbert's theorem; this is followed by a discussion of the properties of functions belonging to the wider classes of positive and negative type.
Abstract: The present memoir is the outcome of an attempt to obtain the conditions under which a given symmetric and continuous function k ( s, t ) is definite, in the sense of Hilbert. At an early stage, however, it was found that the class of definite functions was too restricted to allow the determination of necessary and sufficient conditions in terms of the determinants of § 10. The discovery that this could be done for functions of positive or negative type, and the fact that almost all the theorems which are true of definite functions are, with slight modification, true of these, led finally to the abandonment of the original plan in favour of a discussion of the properties of functions belonging to the wider classes. The first part of the memoir is devoted to the definition of various terms employed, and to the re-statement of the consequences which follow from Hilbert’s theorem.
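The memoir's central result is what is now called Mercer's theorem; a modern statement (a paraphrase, not a quotation from the memoir) reads:

    \textbf{Theorem (Mercer).}
    Let $k : [a,b] \times [a,b] \to \mathbb{R}$ be continuous, symmetric and of
    positive type, i.e.
    $\int_a^b \!\! \int_a^b k(s,t)\, g(s)\, g(t)\, ds\, dt \ge 0$
    for all $g \in L^2[a,b]$. Then
    $k(s,t) = \sum_{n=1}^{\infty} \lambda_n\, \phi_n(s)\, \phi_n(t)$,
    the series converging absolutely and uniformly, where $(\lambda_n, \phi_n)$
    are the nonnegative eigenvalues and orthonormal eigenfunctions of the
    integral operator $(T_k f)(s) = \int_a^b k(s,t)\, f(t)\, dt$.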

1,988 citations

Journal ArticleDOI

1,245 citations

Journal ArticleDOI
TL;DR: In this paper, a connection is drawn between the problem of isometric imbedding and the concept of positive definite functions, and it is shown that the possibility of isometrically imbedding a separable metric space 𝔈 in the Hilbert space ℌ is very easily expressible in terms of the elementary function e^{-t^2} and the concept of positive definite functions (Theorem 1), provided this concept is properly enlarged.
Abstract: As p → ∞ we get the space E_m^∞ with the distance function max_{i=1,...,m} |x_i − x_i'|. Let, furthermore, l^p stand for the space of real sequences with the series of pth powers of the absolute values convergent. Similarly let L^p denote the space of real measurable functions in the interval (0, 1) which are summable to the pth power, while C shall mean the space of real continuous functions in the same interval. In all these spaces a distance function is assumed to be defined as usual. L^2 is equivalent to the real Hilbert space ℌ. The spaces E_m^p, l^p and L^p are metric only if p ≥ 1, but we shall consider them also for positive values of p < 1. A general theorem of Banach and Mazur ([1], p. 187) states that any separable metric space 𝔈 may be imbedded isometrically in the space C. Furthermore, as a special case of a well-known theorem of Urysohn, any such space 𝔈 may be imbedded topologically in ℌ. Isometric imbeddability of 𝔈 in ℌ is, however, a much more restricted property of 𝔈. The chief purpose of this paper is to point out the intimate relationship between the problem of isometric imbedding and the concept of positive definite functions, if this concept is properly enlarged. As a first approach to this connection we consider here isometric imbedding in Hilbert space only. It turns out that the possibility of imbedding 𝔈 in ℌ is very easily expressible in terms of the elementary function e^{-t^2} and the concept of positive definite functions (Theorem 1). The author's previous result ([10]) to the effect that ℌ(γ), (0 < γ < 1), which is the space arising from ℌ by raising its metric to a
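Theorem 1, referred to in the abstract, is Schoenberg's characterisation of isometric imbeddability; in modern form (a standard restatement, not a quotation from the paper):

    \textbf{Theorem (Schoenberg).}
    A separable metric space $(\mathfrak{E}, d)$ can be imbedded isometrically in the
    Hilbert space $\mathfrak{H}$ if and only if $e^{-\gamma\, d(x,y)^2}$ is a positive
    definite function on $\mathfrak{E} \times \mathfrak{E}$ for every $\gamma > 0$.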

837 citations