
Showing papers by "Klaus-Robert Müller" published in 2001


Journal ArticleDOI
TL;DR: This paper provides an introduction to support vector machines, kernel Fisher discriminant analysis, and kernel principal component analysis, as examples for successful kernel-based learning methods.
Abstract: This paper provides an introduction to support vector machines, kernel Fisher discriminant analysis, and kernel principal component analysis, as examples for successful kernel-based learning methods. We first give a short background on Vapnik-Chervonenkis theory and kernel feature spaces and then proceed to kernel-based learning in supervised and unsupervised scenarios, including practical and algorithmic considerations. We illustrate the usefulness of kernel algorithms by discussing applications such as optical character recognition and DNA analysis.

3,566 citations
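
To make the kernel feature-space idea concrete, here is a minimal kernel PCA sketch in Python (NumPy only); the RBF kernel, its width, and the toy data are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    """Gaussian (RBF) kernel matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
    sq = np.sum(X ** 2, axis=1)
    return np.exp(-gamma * (sq[:, None] + sq[None, :] - 2 * X @ X.T))

def kernel_pca(X, n_components=2, gamma=1.0):
    """Project the data onto the leading principal components in kernel feature space."""
    K = rbf_kernel(X, gamma)
    n = K.shape[0]
    one_n = np.ones((n, n)) / n
    K_c = K - one_n @ K - K @ one_n + one_n @ K @ one_n   # center in feature space
    eigvals, eigvecs = np.linalg.eigh(K_c)                 # ascending eigenvalues
    idx = np.argsort(eigvals)[::-1][:n_components]
    alphas = eigvecs[:, idx] / np.sqrt(np.maximum(eigvals[idx], 1e-12))
    return K_c @ alphas                                    # projections of the training points

X = np.random.RandomState(0).randn(100, 3)
print(kernel_pca(X, n_components=2, gamma=0.5).shape)      # (100, 2)
```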


Journal ArticleDOI
TL;DR: It is found that ADABOOST asymptotically achieves a hard margin distribution, i.e. the algorithm concentrates its resources on a few hard-to-learn patterns that are interestingly very similar to Support Vectors.
Abstract: Recently, ensemble methods like ADABOOST have been applied successfully in many problems, while seemingly defying the problem of overfitting. ADABOOST rarely overfits in the low-noise regime; however, we show that it clearly does so for higher noise levels. Central to the understanding of this fact is the margin distribution. ADABOOST can be viewed as a constrained gradient descent in an error function with respect to the margin. We find that ADABOOST asymptotically achieves a hard margin distribution, i.e. the algorithm concentrates its resources on a few hard-to-learn patterns that are, interestingly, very similar to Support Vectors. A hard margin is clearly a sub-optimal strategy in the noisy case, and regularization, in our case a “mistrust” in the data, must be introduced in the algorithm to alleviate the distortions that single difficult patterns (e.g. outliers) can cause to the margin distribution. We propose several regularization methods and generalizations of the original ADABOOST algorithm to achieve a soft margin. In particular we suggest (1) regularized ADABOOST-REG, where the gradient descent is done directly with respect to the soft margin, and (2) regularized linear and quadratic programming (LP/QP-) ADABOOST, where the soft margin is attained by introducing slack variables. Extensive simulations demonstrate that the proposed regularized ADABOOST-type algorithms are useful and yield competitive results for noisy data.

1,367 citations
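
As a rough illustration of the margin view (plain AdaBoost with decision stumps, not the paper's regularized ADABOOST-REG or LP/QP variants), the sketch below boosts a noisy toy problem and inspects the normalized margin distribution; the dataset, noise level, and normalization are assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

rng = np.random.RandomState(0)
X = rng.randn(300, 2)
y = np.where(X[:, 0] * X[:, 1] > 0, 1, -1)   # toy labels in {-1, +1}
flip = rng.rand(300) < 0.1                   # flip 10% of labels to mimic label noise
y[flip] = -y[flip]

clf = AdaBoostClassifier(n_estimators=200, random_state=0).fit(X, y)

# Margin of each training sample: label times the (roughly normalized) ensemble vote.
scores = clf.decision_function(X)
margins = y * scores / np.abs(scores).max()
# The smallest margins belong to the hard (often noisy) patterns the booster focuses on.
print("smallest margins:", np.round(np.sort(margins)[:5], 3))
```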


Proceedings Article
03 Jan 2001
TL;DR: This work detects upcoming finger movements in a natural keyboard typing condition and predicts their laterality in a pseudo-online simulation, and compares discriminative classifiers like Support Vector Machines (SVMs) and different variants of Fisher Discriminant that possess favorable regularization properties for dealing with high noise cases.
Abstract: Driven by the progress in the field of single-trial analysis of EEG, there is a growing interest in brain-computer interfaces (BCIs), i.e., systems that enable human subjects to control a computer only by means of their brain signals. In a pseudo-online simulation our BCI detects upcoming finger movements in a natural keyboard typing condition and predicts their laterality. This can be done on average 100-230 ms before the respective key is actually pressed, i.e., long before the onset of EMG. Our approach is appealing for its short response time and high classification accuracy (>96%) in a binary decision where no human training is involved. We compare discriminative classifiers like Support Vector Machines (SVMs) and different variants of the Fisher Discriminant that possess favorable regularization properties for dealing with high-noise cases (inter-trial variability).

496 citations
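
As a sketch of one regularized classifier of the kind compared in such settings, below is a shrinkage-regularized Fisher discriminant in NumPy; the shrinkage form and its strength are generic assumptions, not the exact regularization used in the paper.

```python
import numpy as np

def train_rlda(X, y, lam=0.1):
    """Regularized Fisher discriminant: shrink the pooled within-class covariance
    toward a scaled identity. X: (n_samples, n_features), y: labels in {-1, +1}."""
    mu_pos, mu_neg = X[y == 1].mean(axis=0), X[y == -1].mean(axis=0)
    Xc = np.vstack([X[y == 1] - mu_pos, X[y == -1] - mu_neg])
    S = Xc.T @ Xc / len(Xc)                                   # pooled covariance
    S_reg = (1 - lam) * S + lam * (np.trace(S) / S.shape[0]) * np.eye(S.shape[0])
    w = np.linalg.solve(S_reg, mu_pos - mu_neg)               # discriminant direction
    b = -w @ (mu_pos + mu_neg) / 2                            # threshold between class means
    return w, b

def predict_rlda(X, w, b):
    return np.sign(X @ w + b)
```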


Journal ArticleDOI
03 Jan 2001
TL;DR: This work proposes a new discriminative TOP kernel derived from tangent vectors of posterior log-odds and develops a theoretical framework on feature extractors from probabilistic models and uses it for analyzing the TOP kernel.
Abstract: Recently, Jaakkola and Haussler (1999) proposed a method for constructing kernel functions from probabilistic models. Their so-called Fisher kernel has been combined with discriminative classifiers such as support vector machines and applied successfully in, for example, DNA and protein analysis. Whereas the Fisher kernel is calculated from the marginal log-likelihood, we propose the TOP kernel derived from tangent vectors of posterior log-odds. Furthermore, we develop a theoretical framework on feature extractors from probabilistic models and use it for analyzing the TOP kernel. In experiments, our new discriminative TOP kernel compares favorably to the Fisher kernel.

154 citations
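
A minimal sketch of the TOP idea for a toy probabilistic model (two isotropic unit-variance Gaussians with equal priors; the model and its parameters are illustrative assumptions, not those of the paper): the feature vector stacks the posterior log-odds with its tangent vectors with respect to the model parameters, and the kernel is their inner product.

```python
import numpy as np

def top_features(X, mu_pos, mu_neg):
    """TOP feature map for the toy model: features = (log-odds v, dv/d mu_pos, dv/d mu_neg)."""
    v = (-0.5 * np.sum((X - mu_pos) ** 2, axis=1)
         + 0.5 * np.sum((X - mu_neg) ** 2, axis=1))   # posterior log-odds (equal priors)
    dv_dmu_pos = X - mu_pos                           # tangent vector w.r.t. mu_pos
    dv_dmu_neg = -(X - mu_neg)                        # tangent vector w.r.t. mu_neg
    return np.hstack([v[:, None], dv_dmu_pos, dv_dmu_neg])

def top_kernel(X1, X2, mu_pos, mu_neg):
    """TOP kernel = inner product of the TOP feature vectors."""
    return top_features(X1, mu_pos, mu_neg) @ top_features(X2, mu_pos, mu_neg).T
```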


Book ChapterDOI
21 Aug 2001
TL;DR: An algorithm to predict the leave-one-out (LOO) error for kernel-based classifiers is proposed, inspired by geometrical intuition, and allows reliable selection of a good model, as demonstrated in simulations on Support Vector and Linear Programming Machines.
Abstract: We propose an algorithm to predict the leave-one-out (LOO) error for kernel-based classifiers. To achieve this goal with computational efficiency, we cast the LOO error approximation task into a classification problem. This means that we need to learn a classification of whether or not a given training sample - if left out of the data set - would be misclassified. For this learning task, simple data-dependent features are proposed, inspired by geometrical intuition. Our approach allows us to reliably select a good model, as demonstrated in simulations on Support Vector and Linear Programming Machines. Comparisons to existing learning-theoretical bounds, e.g. the span bound, are given for various model selection scenarios.

33 citations
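
A hedged sketch of casting LOO prediction as a learning problem: on a small toy set the exact LOO labels are computed by brute force, and a simple predictor is fit on per-sample features (the features below are simplified stand-ins for the paper's geometrically motivated ones; dataset and parameters are assumptions).

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression

rng = np.random.RandomState(0)
X = rng.randn(120, 2)
y = np.where(X[:, 0] + rng.randn(120) > 0, 1, -1)     # overlapping classes

svm = SVC(kernel="rbf", C=1.0).fit(X, y)

# Exact LOO labels: 1 if leaving sample i out would misclassify it (feasible on a toy set).
loo_err = np.empty(len(X), dtype=int)
for i in range(len(X)):
    mask = np.arange(len(X)) != i
    pred = SVC(kernel="rbf", C=1.0).fit(X[mask], y[mask]).predict(X[i:i + 1])[0]
    loo_err[i] = int(pred != y[i])

# Simple per-sample features: signed SVM output and support-vector membership.
is_sv = np.zeros(len(X)); is_sv[svm.support_] = 1.0
feats = np.column_stack([y * svm.decision_function(X), is_sv])

# Learn to predict which samples would be misclassified if left out.
predictor = LogisticRegression().fit(feats, loo_err)
print("predicted LOO error:", predictor.predict(feats).mean(), "exact:", loo_err.mean())
```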


Proceedings Article
03 Jan 2001
TL;DR: A new mathematical construction is proposed that makes it possible to adapt to the intrinsic dimension of the data and to find an orthonormal basis of this submanifold, and allows the derivation of elegant kernelized blind source separation (BSS) algorithms for arbitrary invertible nonlinear mixings.
Abstract: In kernel-based learning the data is mapped to a kernel feature space of a dimension that corresponds to the number of training data points. In practice, however, the data forms a smaller submanifold in feature space, a fact that has been used e.g. by reduced set techniques for SVMs. We propose a new mathematical construction that permits us to adapt to the intrinsic dimension and to find an orthonormal basis of this submanifold. In doing so, computations become much simpler and, more importantly, our theoretical framework allows us to derive elegant kernelized blind source separation (BSS) algorithms for arbitrary invertible nonlinear mixings. Experiments demonstrate the good performance and high computational efficiency of our kTDSEP algorithm for the problem of nonlinear BSS.

29 citations
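
A rough sketch of the basis-construction step in NumPy (the RBF kernel, random choice of basis-inducing points, and omission of the subsequent temporal-decorrelation/BSS step are all simplifications of kTDSEP): a subset of mapped points is orthonormalized via its kernel matrix, giving low-dimensional feature-space coordinates for every sample.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    d = np.sum(A ** 2, 1)[:, None] + np.sum(B ** 2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * d)

def feature_space_coordinates(X, n_basis=20, gamma=1.0, seed=0):
    """Coordinates of all samples w.r.t. an orthonormal basis of the submanifold
    spanned by a subset of mapped points."""
    rng = np.random.RandomState(seed)
    V = X[rng.choice(len(X), n_basis, replace=False)]    # basis-inducing points
    K_vv = rbf_kernel(V, V, gamma)
    eigvals, eigvecs = np.linalg.eigh(K_vv)
    keep = eigvals > 1e-10                               # effective intrinsic dimension
    W = eigvecs[:, keep] / np.sqrt(eigvals[keep])        # whitening = orthonormalization
    K_vx = rbf_kernel(V, X, gamma)
    return W.T @ K_vx                                    # shape: (intrinsic_dim, n_samples)
```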


Journal ArticleDOI
TL;DR: Using Gaussian kernels to define the correlation sum, it is shown theoretically that the estimates, which are derived for additive white Gaussian noise, are also robust for moderately colored noise and underline the usefulness of the proposed correction schemes.
Abstract: Using Gaussian kernels to define the correlation sum, we derive simple formulas that correct the noise bias in estimates of the correlation dimension and K2 entropy of chaotic time series. The corrections are based only on the difference of correlation dimensions for adjacent embedding dimensions and hence preserve the full functional dependencies on both the scale parameter and the embedding dimension. It is shown theoretically that the estimates, which are derived for additive white Gaussian noise, are also robust for moderately colored noise. Simulations underline the usefulness of the proposed correction schemes. It is demonstrated that the method also gives satisfactory results for non-Gaussian and dynamical noise.

25 citations
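
For orientation, a sketch of a Gaussian-kernel correlation sum on a delay-embedded time series (the bandwidth convention, the logistic-map example, and all parameters are assumptions; the paper's noise-bias correction formulas are not reproduced here).

```python
import numpy as np

def delay_embed(x, m, tau=1):
    """Delay-embed a scalar time series into m-dimensional vectors."""
    n = len(x) - (m - 1) * tau
    return np.column_stack([x[i * tau: i * tau + n] for i in range(m)])

def gaussian_correlation_sum(x, m, h, tau=1):
    """Correlation sum with a Gaussian kernel instead of the usual hard threshold."""
    X = delay_embed(x, m, tau)
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * X @ X.T          # squared pairwise distances
    iu = np.triu_indices(len(X), k=1)                     # distinct pairs only
    return np.mean(np.exp(-d2[iu] / (4 * h ** 2)))

# Example: logistic-map series, compared across adjacent embedding dimensions.
x = np.empty(1000); x[0] = 0.3
for t in range(999):
    x[t + 1] = 4 * x[t] * (1 - x[t])
for m in (2, 3):
    print(m, gaussian_correlation_sum(x, m, h=0.05))
```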


Proceedings Article
03 Jan 2001
TL;DR: In this article, the authors use resampling methods to assess the quality of the discovered projections and show experimentally that their proposed variance estimations are strongly correlated with the separation error, and demonstrate that this reliability estimation can be used to choose the appropriate ICA model and to significantly enhance the separation performance.
Abstract: When applying unsupervised learning techniques like ICA or temporal decorrelation, a key question is whether the discovered projections are reliable. In other words: can we give error bars, or can we assess the quality of our separation? We use resampling methods to tackle these questions and show experimentally that our proposed variance estimations are strongly correlated with the separation error. We demonstrate that this reliability estimation can be used to choose the appropriate ICA model, to significantly enhance the separation performance, and, most importantly, to identify the components that have an actual physical meaning. Application to 49-channel data from a magnetoencephalography (MEG) experiment underlines the usefulness of our approach.

15 citations
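
A minimal sketch of the resampling idea using FastICA on toy mixtures (FastICA, the bootstrap scheme, and the direction-matching step are illustrative substitutes; the paper applies resampling to ICA/temporal-decorrelation separation and MEG data): refit on resampled data and measure how much each unmixing direction varies.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.RandomState(0)
S = np.column_stack([np.sign(rng.randn(2000)), rng.laplace(size=2000)])  # two sources
X = S @ np.array([[1.0, 0.6], [0.4, 1.0]])                               # linear mixtures

W_ref = FastICA(n_components=2, random_state=0).fit(X).components_
W_ref = W_ref / np.linalg.norm(W_ref, axis=1, keepdims=True)

# Bootstrap: refit ICA on resampled data and record how far each unmixing direction
# drifts from the reference (a rough stand-in for the paper's variance estimate).
deviations = []
for b in range(20):
    idx = rng.randint(0, len(X), len(X))
    W_b = FastICA(n_components=2, random_state=b).fit(X[idx]).components_
    W_b = W_b / np.linalg.norm(W_b, axis=1, keepdims=True)
    sim = np.abs(W_ref @ W_b.T)                 # match directions up to sign/permutation
    deviations.append(1 - sim.max(axis=1))
print("per-component instability:", np.mean(deviations, axis=0))
```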