Topic

# Kernel embedding of distributions

About: Kernel embedding of distributions is a research topic. Over the lifetime, 5071 publications have been published within this topic receiving 181915 citations. The topic is also known as: kernel mean & mean map.

##### Papers published on a yearly basis

##### Papers

More filters

••

TL;DR: A new method for performing a nonlinear form of principal component analysis by the use of integral operator kernel functions is proposed and experimental results on polynomial feature extraction for pattern recognition are presented.

Abstract: A new method for performing a nonlinear form of principal component analysis is proposed. By the use of integral operator kernel functions, one can efficiently compute principal components in high-dimensional feature spaces, related to input space by some nonlinear map—for instance, the space of all possible five-pixel products in 16 × 16 images. We give the derivation of the method and present experimental results on polynomial feature extraction for pattern recognition.

8,175 citations

•

01 Jan 2004TL;DR: This book provides an easy introduction for students and researchers to the growing field of kernel-based pattern analysis, demonstrating with examples how to handcraft an algorithm or a kernel for a new specific application, and covering all the necessary conceptual and mathematical tools to do so.

Abstract: Kernel methods provide a powerful and unified framework for pattern discovery, motivating algorithms that can act on general types of data (e.g. strings, vectors or text) and look for general types of relations (e.g. rankings, classifications, regressions, clusters). The application areas range from neural networks and pattern recognition to machine learning and data mining. This book, developed from lectures and tutorials, fulfils two major roles: firstly it provides practitioners with a large toolkit of algorithms, kernels and solutions ready to use for standard pattern discovery problems in fields such as bioinformatics, text analysis, image analysis. Secondly it provides an easy introduction for students and researchers to the growing field of kernel-based pattern analysis, demonstrating with examples how to handcraft an algorithm or a kernel for a new specific application, and covering all the necessary conceptual and mathematical tools to do so.

6,050 citations

••

TL;DR: This paper provides an introduction to support vector machines, kernel Fisher discriminant analysis, and kernel principal component analysis, as examples for successful kernel-based learning methods.

Abstract: This paper provides an introduction to support vector machines, kernel Fisher discriminant analysis, and kernel principal component analysis, as examples for successful kernel-based learning methods. We first give a short background about Vapnik-Chervonenkis theory and kernel feature spaces and then proceed to kernel based learning in supervised and unsupervised scenarios including practical and algorithmic considerations. We illustrate the usefulness of kernel algorithms by discussing applications such as optical character recognition and DNA analysis.

3,566 citations

•

03 Dec 2007

TL;DR: Two sets of random features are explored, provided convergence bounds on their ability to approximate various radial basis kernels, and it is shown that in large-scale classification and regression tasks linear machine learning algorithms applied to these features outperform state-of-the-art large- scale kernel machines.

Abstract: To accelerate the training of kernel machines, we propose to map the input data to a randomized low-dimensional feature space and then apply existing fast linear methods. The features are designed so that the inner products of the transformed data are approximately equal to those in the feature space of a user specified shift-invariant kernel. We explore two sets of random features, provide convergence bounds on their ability to approximate various radial basis kernels, and show that in large-scale classification and regression tasks linear machine learning algorithms applied to these features outperform state-of-the-art large-scale kernel machines.

3,405 citations

••

TL;DR: This paper shows how the kernel matrix can be learned from data via semidefinite programming (SDP) techniques and leads directly to a convex method for learning the 2-norm soft margin parameter in support vector machines, solving an important open problem.

Abstract: Kernel-based learning algorithms work by embedding the data into a Euclidean space, and then searching for linear relations among the embedded data points. The embedding is performed implicitly, by specifying the inner products between each pair of points in the embedding space. This information is contained in the so-called kernel matrix, a symmetric and positive semidefinite matrix that encodes the relative positions of all points. Specifying this matrix amounts to specifying the geometry of the embedding space and inducing a notion of similarity in the input space---classical model selection problems in machine learning. In this paper we show how the kernel matrix can be learned from data via semidefinite programming (SDP) techniques. When applied to a kernel matrix associated with both training and test data this gives a powerful transductive algorithm---using the labeled part of the data one can learn an embedding also for the unlabeled part. The similarity between test points is inferred from training points and their labels. Importantly, these learning problems are convex, so we obtain a method for learning both the model class and the function without local minima. Furthermore, this approach leads directly to a convex method for learning the 2-norm soft margin parameter in support vector machines, solving an important open problem.

2,419 citations