
Showing papers by "Klaus-Robert Müller published in 1999"


Journal ArticleDOI
TL;DR: The geometry of feature space is reviewed, and the connection between feature space and input space is discussed by dealing with the question of how one can, given some vector in feature space, find a preimage in input space.
Abstract: This paper collects some ideas targeted at advancing our understanding of the feature spaces associated with support vector (SV) kernel functions. We first discuss the geometry of feature space. In particular, we review what is known about the shape of the image of input space under the feature space map, and how this influences the capacity of SV methods. Following this, we describe how the metric governing the intrinsic geometry of the mapped surface can be computed in terms of the kernel, using the example of the class of inhomogeneous polynomial kernels, which are often used in SV pattern recognition. We then discuss the connection between feature space and input space by dealing with the question of how one can, given some vector in feature space, find a preimage (exact or approximate) in input space. We describe algorithms to tackle this issue, and show their utility in two applications of kernel methods. First, we use it to reduce the computational complexity of SV decision functions; second, we combine it with the kernel PCA algorithm, thereby constructing a nonlinear statistical denoising technique which is shown to perform well on real-world data.
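
The preimage discussion invites a small illustration. Below is a minimal sketch of a fixed-point iteration for Gaussian (RBF) kernels: given expansion coefficients gamma_i of a feature-space vector Psi = sum_i gamma_i Phi(x_i), for instance the kernel-PCA-denoised image of a test point, an approximate preimage is sought as a kernel-weighted mean of the training points. All names and parameters are illustrative, and the iteration can get stuck in local optima, so the starting point z0 matters.

```python
import numpy as np

def rbf(z, X, sigma):
    """Gaussian kernel values k(z, x_i) for all training rows x_i of X."""
    return np.exp(-np.sum((X - z) ** 2, axis=1) / (2.0 * sigma ** 2))

def preimage(gamma, X, sigma, z0, n_iter=100, tol=1e-8):
    """Approximate preimage of Psi = sum_i gamma_i Phi(x_i) under the
    Gaussian kernel, via fixed-point iteration: z is updated to a
    kernel-weighted mean of the training points."""
    z = z0.astype(float).copy()
    for _ in range(n_iter):
        w = gamma * rbf(z, X, sigma)   # gamma_i * k(z, x_i)
        z_new = w @ X / w.sum()        # weighted mean of training points
        if np.linalg.norm(z_new - z) < tol:
            break
        z = z_new
    return z
```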

1,258 citations


Proceedings ArticleDOI
08 Feb 1999
TL;DR: In this paper, a nonlinear form of principal component analysis (PCA) is proposed to perform polynomial feature extraction in high-dimensional feature spaces, related to input space by some nonlinear map; for instance, the space of all possible d-pixel products in images.
Abstract: A new method for performing a nonlinear form of Principal Component Analysis is proposed. By the use of integral operator kernel functions, one can efficiently compute principal components in high-dimensional feature spaces, related to input space by some nonlinear map; for instance, the space of all possible d-pixel products in images. We give the derivation of the method and present experimental results on polynomial feature extraction for pattern recognition.
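
For concreteness, here is a minimal kernel PCA sketch under the abstract's setup: center the kernel matrix in feature space, eigendecompose, and project the training points onto the leading nonlinear components. Function and variable names are illustrative.

```python
import numpy as np

def kernel_pca(K, n_components):
    """Kernel PCA on a precomputed kernel matrix K (n x n): center K in
    feature space, eigendecompose, and return the projections of the
    training points onto the leading nonlinear principal components."""
    n = K.shape[0]
    one_n = np.ones((n, n)) / n
    Kc = K - one_n @ K - K @ one_n + one_n @ K @ one_n   # feature-space centering
    eigvals, eigvecs = np.linalg.eigh(Kc)                # ascending eigenvalues
    idx = np.argsort(eigvals)[::-1][:n_components]
    # scale so each feature-space eigenvector has unit norm
    alphas = eigvecs[:, idx] / np.sqrt(np.maximum(eigvals[idx], 1e-12))
    return Kc @ alphas                                   # (n, n_components) projections
```

For the "d-pixel products" example, one would pass K = (X @ X.T) ** d, with one flattened image per row of X.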

430 citations


Proceedings Article
29 Nov 1999
TL;DR: Employing a unified framework in terms of a nonlinear variant of the Rayleigh coefficient, this work proposes non-linear generalizations of Fisher's discriminant and oriented PCA using Support Vector kernel functions.
Abstract: We incorporate prior knowledge to construct nonlinear algorithms for invariant feature extraction and discrimination. Employing a unified framework in terms of a nonlinear variant of the Rayleigh coefficient, we propose non-linear generalizations of Fisher's discriminant and oriented PCA using Support Vector kernel functions. Extensive simulations show the utility of our approach.
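
As a sketch of the Rayleigh-coefficient view in the two-class Fisher case: writing the discriminant direction as a kernel expansion w = sum_i alpha_i Phi(x_i) turns the between-class and within-class scatter into matrices M and N acting on alpha, and the maximizer of the Rayleigh coefficient is alpha proportional to N^{-1}(m_+ - m_-). A minimal, regularized version with illustrative names, assuming labels in {-1, +1}:

```python
import numpy as np

def kernel_fisher(K, y, reg=1e-3):
    """Kernel Fisher discriminant: maximize the Rayleigh coefficient
    J(alpha) = (alpha' M alpha) / (alpha' N alpha) over kernel expansion
    coefficients alpha.  K is the full n x n kernel matrix, y in {-1, +1}."""
    n = K.shape[0]
    m = {}
    N = reg * np.eye(n)                 # regularized within-class matrix
    for c in (-1, +1):
        idx = np.where(y == c)[0]
        Kc = K[:, idx]                  # n x l_c block of kernel columns
        m[c] = Kc.mean(axis=1)
        l = len(idx)
        N += Kc @ (np.eye(l) - np.ones((l, l)) / l) @ Kc.T
    # closed-form maximizer: alpha proportional to N^{-1}(m_+ - m_-)
    alpha = np.linalg.solve(N, m[+1] - m[-1])
    return alpha                        # project a new x via sum_i alpha_i k(x_i, x)
```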

207 citations



Proceedings Article
29 Nov 1999
TL;DR: This work assumes linear combinations of reflectance spectra with some additive normal sensor noise and derives a probabilistic MAP framework for analyzing hyperspectral data and develops an algorithm that can be understood as constrained independent component analysis (ICA).
Abstract: In hyperspectral imagery one pixel typically consists of a mixture of the reflectance spectra of several materials, where the mixture coefficients correspond to the abundances of the constituting materials. We assume linear combinations of reflectance spectra with some additive normal sensor noise and derive a probabilistic MAP framework for analyzing hyperspectral data. As the material reflectance characteristics are not known a priori, we face the problem of unsupervised linear unmixing. The incorporation of different prior information (e.g. positivity and normalization of the abundances) naturally leads to a family of interesting algorithms, for example in the noise-free case yielding an algorithm that can be understood as constrained independent component analysis (ICA). Simulations underline the usefulness of our theory.
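
To make the MAP view concrete, the toy sketch below performs the constrained abundance estimate for a single pixel when the endmember spectra are assumed known; the paper's actual setting is unsupervised (the spectra are unknown), so this only illustrates how the positivity and normalization priors act during inference. All names and step sizes are illustrative.

```python
import numpy as np

def unmix_pixel(x, A, n_iter=500, lr=1e-3):
    """MAP-style abundance estimate for one pixel under the linear mixing
    model x = A s + Gaussian noise.  A: (bands x materials) endmember
    spectra, assumed known here purely for illustration.  Positivity and
    sum-to-one priors are enforced by a crude projection (clip, then
    renormalize) after each gradient step."""
    m = A.shape[1]
    s = np.full(m, 1.0 / m)                    # start from uniform abundances
    for _ in range(n_iter):
        grad = A.T @ (A @ s - x)               # gradient of the Gaussian negative log-likelihood
        s = np.clip(s - lr * grad, 0.0, None)  # positivity
        s /= s.sum() + 1e-12                   # normalization of abundances
    return s
```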

158 citations


Proceedings ArticleDOI
01 Jan 1999
TL;DR: A new linear program for classifying data given in terms of pairwise proximities, which makes it possible to avoid the problems inherent in using feature spaces with an indefinite metric in support vector machines.
Abstract: We provide a new linear program for the classification of data given in terms of pairwise proximities. This makes it possible to avoid the problems inherent in using feature spaces with an indefinite metric in support vector machines, since the notion of a margin is needed purely in the input space, where the classification actually occurs. Moreover, in our approach we can enforce sparsity in the proximity representation by sacrificing training error. This turns out to be favorable for proximity data. Similar to ν-SV methods, the only parameter needed in the algorithm is the (asymptotic) number of data points being classified with a margin. Finally, the algorithm is successfully compared with ν-SV learning in proximity space and with K-nearest-neighbors on real-world data from neuroscience and molecular biology.
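
A linear program in this spirit can be set up with an off-the-shelf LP solver. The sketch below trains an L1-regularized classifier directly on the proximity matrix, which is what produces sparsity in the proximity representation; note that it uses a plain C-parameterization rather than the paper's ν-parameterization, and all names are illustrative.

```python
import numpy as np
from scipy.optimize import linprog

def lp_machine(P, y, C=1.0):
    """Sparse linear-programming classifier on a proximity matrix P
    (n x n, P[i, j] = proximity of examples i and j), labels y in {-1, +1}.
    Variables: alpha = a_pos - a_neg, bias b = b_pos - b_neg, slacks xi.
    Objective: ||alpha||_1 + C * sum(xi), subject to
    y_i (sum_j alpha_j P[i, j] + b) >= 1 - xi_i."""
    n = P.shape[0]
    # variable layout: [a_pos (n), a_neg (n), b_pos, b_neg, xi (n)]
    c = np.concatenate([np.ones(2 * n), [0.0, 0.0], C * np.ones(n)])
    Y = np.diag(y.astype(float))
    yc = y[:, None].astype(float)
    # margin constraints rewritten as A_ub @ vars <= b_ub
    A_ub = np.hstack([-Y @ P, Y @ P, -yc, yc, -np.eye(n)])
    b_ub = -np.ones(n)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * (3 * n + 2))
    alpha = res.x[:n] - res.x[n:2 * n]
    b = res.x[2 * n] - res.x[2 * n + 1]
    return alpha, b   # classify new proximities p via sign(p @ alpha + b)
```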

96 citations



Proceedings Article
29 Nov 1999
TL;DR: A new boosting algorithm is proposed which allows for the possibility of a pre-specified fraction of points to lie in the margin area, even on the wrong side of the decision boundary.
Abstract: AdaBoost and other ensemble methods have successfully been applied to a number of classification tasks, seemingly defying problems of overfitting. AdaBoost performs gradient descent in an error function with respect to the margin, asymptotically concentrating on the patterns which are hardest to learn. For very noisy problems, however, this can be disadvantageous. Indeed, theoretical analysis has shown that the margin distribution, as opposed to just the minimal margin, plays a crucial role in understanding this phenomenon. Loosely speaking, some outliers should be tolerated if this has the benefit of substantially increasing the margin on the remaining points. We propose a new boosting algorithm which allows for the possibility of a pre-specified fraction of points lying in the margin area, or even on the wrong side of the decision boundary.
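
For reference, here is plain AdaBoost in a few lines, which makes the reweighting toward the hardest patterns explicit; the paper's proposed algorithm additionally tolerates a pre-specified fraction of margin errors, which this sketch deliberately does not reproduce.

```python
import numpy as np

def adaboost(X, y, weak_fit, n_rounds=50):
    """Plain AdaBoost.  weak_fit(X, y, w) must return a classifier h
    (callable) with h(X) in {-1, +1}, trained on sample weights w."""
    n = len(y)
    w = np.full(n, 1.0 / n)
    ensemble = []
    for _ in range(n_rounds):
        h = weak_fit(X, y, w)
        pred = h(X)
        err = w[pred != y].sum()
        if err <= 0 or err >= 0.5:             # degenerate weak learner: stop
            break
        beta = 0.5 * np.log((1 - err) / err)   # hypothesis weight
        w *= np.exp(-beta * y * pred)          # emphasize the hardest patterns
        w /= w.sum()
        ensemble.append((beta, h))
    return lambda Xq: np.sign(sum(b * h(Xq) for b, h in ensemble))
```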

15 citations


Proceedings Article
01 Jan 1999
TL;DR: With the described techniques, the recognition performance can be improved by 26% over leading existing approaches, and there is evidence that existing related methods could profit from advanced TIS (translation initiation site) recognition.

15 citations


Book ChapterDOI
01 May 1999
TL;DR: In this article, an adaptive on-line algorithm extending the learning of learning idea is proposed and theoretically motivated, which can be applied to learning continuous functions or distributions, even when no explicit loss function is given and the Hessian is not available.
Abstract: An adaptive on-line algorithm extending the "learning of learning" idea is proposed and theoretically motivated. Relying only on gradient flow information, it can be applied to learning continuous functions or distributions, even when no explicit loss function is given and the Hessian is not available. Its efficiency is demonstrated for drifting and switching non-stationary blind separation tasks of acoustic signals.

Introduction: Neural networks are powerful tools that can capture the structure in data by learning. Often the batch learning paradigm is assumed, where the learner is given all training examples simultaneously and allowed to use them as often as desired. In large practical applications, however, batch learning often proves infeasible, and on-line learning is employed instead. In the on-line learning scenario only one example is given at a time and then discarded after learning. It is thus less memory consuming, and at the same time it fits well into more natural learning, where the learner receives new information at every moment and should adapt to it, without needing a large memory for storing old data. Apart from easier feasibility and data handling, the most important advantage of on-line learning is its ability to adapt to changing environments, a quite common scenario in industrial applications where the data distribution changes gradually over time (e.g. due to wear and tear of the machines). If the learning machine does not detect and follow the change, it is impossible to learn the data properly and large generalization errors will result.
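
One way to read the "gradient flow only" idea as code: maintain a leaky average of recent stochastic gradients and let the learning rate grow while that flow stays large (drift or switch in the environment) and shrink as it vanishes (convergence). The update rule and hyperparameters below are an illustrative sketch, not the chapter's exact algorithm.

```python
import numpy as np

def adaptive_online(theta0, grad_fn, stream, eta0=0.05,
                    a=0.02, b=2.0, delta=0.1):
    """Learning-rate adaptation from gradient-flow information only.
    grad_fn(theta, x) returns a stochastic gradient for one example x;
    a, b, delta are illustrative hyperparameters."""
    theta, eta = theta0.astype(float).copy(), eta0
    r = np.zeros_like(theta)               # leaky average of the flow
    for x in stream:
        g = grad_fn(theta, x)
        theta -= eta * g                   # on-line update, one example at a time
        r = (1.0 - delta) * r + delta * g  # accumulated gradient flow
        # grow eta while the flow is large (drift), shrink near convergence
        eta += a * eta * (b * np.linalg.norm(r) - eta)
    return theta
```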

12 citations


Proceedings ArticleDOI
01 Jan 1999
TL;DR: A framework for modeling switching dynamics from a time series that allows for a fast online detection of dynamical mode changes based on a hidden Markov model (HMM) of prediction experts and an input-density estimator generated for each expert.
Abstract: We present a framework for modeling switching dynamics from a time series that allows for a fast online detection of dynamical mode changes. The method is based on a hidden Markov model (HMM) of prediction experts. The predictors are trained by expectation maximization (EM) and by using an annealing schedule for the HMM state probabilities. This leads to a segmentation of the time series into different dynamical modes and a simultaneous specialization of the prediction experts on the segments. In a second step, an input-density estimator is generated for each expert. It can simply be computed from the data subset assigned to the respective expert. In conjunction with the HMM state probabilities, this allows for a very fast online detection of mode changes: change points are detected as soon as the incoming input data stream contains sufficient information to indicate a change in the dynamics.
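
Once the experts and input-density estimators are trained, the online detection step amounts to the standard HMM filtering recursion. A sketch, where lik[m] stands for the combined evidence of the newest observation under expert m (for example, prediction-error likelihood times input density, as in the abstract):

```python
import numpy as np

def online_mode_probs(p_prev, A, lik):
    """One step of HMM filtering for on-line mode detection.
    p_prev: P(mode | data so far); A[i, j]: transition probability
    from mode i to mode j; lik[m]: likelihood of the newest
    observation under expert m."""
    p = (A.T @ p_prev) * lik
    return p / p.sum()

# a change point is flagged as soon as the argmax of the posterior switches
```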

Proceedings ArticleDOI
23 Aug 1999
TL;DR: In this paper, the mixtures of experts approach and a generalized hidden Markov model with an input-dependent transition matrix are combined to predict non-stationary dynamical systems by identifying appropriate sub-dynamics and detecting mode changes early.
Abstract: The prediction of non-stationary dynamical systems may be performed by identifying appropriate sub-dynamics and an early detection of mode changes. We present a framework which unifies the mixtures of experts approach and a generalized hidden Markov model with an input-dependent transition matrix: the hidden Markov mixtures of experts (HMME). The gating procedure incorporates state memory, information about the current location in phase space, and the previous prediction performance. The experts and the hidden Markov gating model are simultaneously trained by an EM algorithm that maximizes the likelihood during an annealing procedure. The HMME architecture allows for a fast online detection of mode changes: change points are detected as soon as the incoming input data stream contains sufficient information to indicate a change in the dynamics.
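
One plausible reading of the input-dependent transition matrix as code: modulate a base transition matrix by each expert's input density evaluated at the current point in phase space, then renormalize row-wise. This is an illustrative sketch, not the paper's exact gating procedure.

```python
import numpy as np

def gated_transitions(A0, densities, x):
    """Input-dependent transition matrix for HMM gating: scale base
    transition probabilities A0[i, j] by how well the current input x
    fits expert j's input-density model, then renormalize each row.
    densities[j](x) returns p_j(x)."""
    d = np.array([p(x) for p in densities])
    A = A0 * d[None, :]            # favor modes whose region contains x
    return A / A.sum(axis=1, keepdims=True)
```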

Journal ArticleDOI
TL;DR: Kernel algorithms in feature spaces are described as elegant and efficient methods of realizing such learning machines; their significance is underlined by a brief survey of industrial and academic applications, including ones where benchmark record results were obtained.
Abstract: This article explains new approaches and results in statistical learning theory. After an introduction, learning from examples is presented, and it is explained that, beyond accounting for the training data, the complexity of the learning machine is essential for successful learning. Kernel algorithms in feature spaces are then introduced; they constitute an elegant and efficient way of realizing various learning machines with controllable complexity through kernel functions. Examples of such algorithms are support vector machines (SVMs), which use kernel functions for estimating functions, and kernel PCA (principal component analysis), which uses kernel functions for extracting nonlinear features from data sets. Far more important than any single example, however, is the insight that every algorithm that can be formulated in terms of scalar products can be generalized nonlinearly by using kernel functions. The significance of kernel algorithms is underlined by a brief survey of industrial and academic applications, where we were able to obtain record results on important, practically relevant benchmarks.
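
The abstract's central insight, that any algorithm formulated in terms of scalar products can be made nonlinear via kernel functions, is easy to demonstrate. The kernel perceptron is perhaps the smallest example: the classical mistake-driven update touches the data only through dot products, so substituting a kernel matrix suffices (a sketch with illustrative names):

```python
import numpy as np

def kernel_perceptron(K, y, n_epochs=10):
    """Kernelized perceptron: the classical update uses the data only
    through scalar products, so replacing them by a kernel matrix makes
    the classifier nonlinear.  K[i, j] = k(x_i, x_j), y in {-1, +1}."""
    n = len(y)
    alpha = np.zeros(n)
    for _ in range(n_epochs):
        for i in range(n):
            # decision value f(x_i) = sum_j alpha_j y_j k(x_j, x_i)
            if y[i] * ((alpha * y) @ K[:, i]) <= 0:
                alpha[i] += 1.0            # mistake-driven update
    return alpha   # classify x via sign(sum_j alpha_j y_j k(x_j, x))
```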

Proceedings ArticleDOI
01 Jan 1999
TL;DR: The results show that efficient search and retrieval in visual database systems is possible based on a normative feature description such as MPEG-7, and a search engine has been developed on the basis of this description scheme, which allows similarity-based retrieval from an image or video database.
Abstract: This paper reports about a description scheme for visual information content, which has been developed in the context of the forthcoming MPEG-7 standard. The system supports similarity-based retrieval of visual (image and video) data along feature axes like color, texture, shape/geometry and motion. The descriptors for these features have been developed in a way such that invariance against common transformations of visual material, e.g. filtering, contrast/color manipulation, resizing etc. is achieved, and that they are fitted to human perception properties. Furthermore, descriptors have been designed that allow a fast, hierarchical search procedure. A search engine has been developed on the basis of this description scheme, which allows similarity-based retrieval from an image or video database. The results show that efficient search and retrieval in visual database systems is possible based on a normative feature description such as MPEG-7.
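
Stripped to its core, similarity-based retrieval of this kind is a descriptor plus a distance ranking. The toy sketch below uses a normalized color histogram and an L1 distance purely for illustration; the MPEG-7 descriptors reported in the paper are far more elaborate and additionally invariant to common transformations such as filtering or resizing.

```python
import numpy as np

def color_histogram(img, bins=8):
    """A toy color descriptor: a normalized joint RGB histogram over an
    (h, w, 3) uint8 image (MPEG-7 defines its own, richer descriptors)."""
    h, _ = np.histogramdd(img.reshape(-1, 3),
                          bins=(bins,) * 3, range=((0, 256),) * 3)
    h = h.ravel()
    return h / h.sum()

def retrieve(query_desc, db_descs, k=5):
    """Rank database entries by L1 distance between descriptors and
    return the indices of the k most similar items."""
    d = np.abs(db_descs - query_desc).sum(axis=1)
    return np.argsort(d)[:k]
```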


Proceedings Article
01 Jan 1999
TL;DR: This work presents a unified framework of a mixtures of experts architecture and a generalized hidden Markov model (HMM) with a state-space-dependent transition matrix, which allows for fast on-line detection of mode changes in cases where the most recent input data together with the last dynamical mode contain sufficient information to indicate a dynamical change.
Abstract: The prediction of switching dynamical systems requires an identification of each individual dynamics and an early detection of mode changes. Here we present a unified framework of a mixtures of experts architecture and a generalized hidden Markov model (HMM) with a state space dependent transition matrix. The specialization of the experts in the dynamical regimes and the adaptation of the switching probabilities is performed simultaneously during the training procedure. We show that our method allows for a fast on-line detection of mode changes in cases where the most recent input data together with the last dynamical mode contain sufficient information to indicate a dynamical change.

Journal ArticleDOI
06 May 1999-Nature
TL;DR: The editor-in-chief of Nutrition, an international medical journal, who is also director of a research laboratory, found the Briefing on science and fraud most interesting, being both a producer and a consumer of science.
Abstract: Editors' responsibility in defeating fraud. Sir — As editor-in-chief of Nutrition, an international medical journal, and as director of a research laboratory, I found your Briefing on science and fraud most interesting, because I am both a producer and a consumer of science (Nature 398, 13–17; 1999). My editorial colleagues and I have a high state of awareness of 'fabrication, falsification and plagiarism (FFP)'. As reviewers of manuscripts, we have a difficult time detecting the two Fs, but allegations of the P have come to our attention several times. I believe that editors have an obligation to the scientific community to pass such concerns to the authors and to their institutes' research dean or administrative supervisor in a confidential manner for investigation according to the guidelines of the US Office of Research Integrity. In doing so, we do not act as "secret police", as the editor of the Journal of the Norwegian Medical Association maintains. Instead, we align ourselves with the UK Committee on Publication Ethics and the World Association of Medical Editors, whose recommendations are in my view appropriate. It does untold harm to the scientific community to be betrayed, deceived and defrauded. Such harm ranges from the squandering of limited research resources to the undermining of confidence and trust in the reporting of scientific findings. A journal should not be used to validate misconduct by publishing fraudulent data submitted knowingly by the author. If this occurs, editors bear an obligation to retract the paper. Our journal asks authors to sign a declaration of scientific integrity in their letter of transmittal. To avoid scientific misconduct in my laboratory, each new research fellow's attention is drawn to this potential problem via policy and procedure material given to them on arrival, and the consequences of such temptations are clearly spelled out. Each new fellow also repeats a portion of their predecessor's work to confirm the results, as an internal control standard. This has not dampened the lust for data among the 'young and hungry'. But, ultimately, solid, reliable laboratory habits and supervision and mentoring are critical components to prevent misconduct. Michael M. Meguid, Nutrition, Department of Surgery, 750 E. Adams St., Syracuse, New York 13210, USA