
Showing papers by Klaus-Robert Müller published in 1998


Journal Article
TL;DR: A new method for performing a nonlinear form of principal component analysis using integral operator kernel functions is proposed, and experimental results on polynomial feature extraction for pattern recognition are presented.
Abstract: A new method for performing a nonlinear form of principal component analysis is proposed. By the use of integral operator kernel functions, one can efficiently compute principal components in high-dimensional feature spaces, related to input space by some nonlinear map—for instance, the space of all possible five-pixel products in 16 × 16 images. We give the derivation of the method and present experimental results on polynomial feature extraction for pattern recognition.
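
The abstract sketches the construction only at a high level; as a rough numpy illustration of the usual steps (kernel matrix, centering in feature space, eigendecomposition, projection), here is a minimal kernel PCA sketch. The Gaussian kernel and all names are illustrative assumptions, not taken from the paper, whose experiments use polynomial kernels:

```python
# Minimal kernel PCA sketch (illustrative; Gaussian kernel, numpy only).
import numpy as np

def kernel_pca(X, n_components=2, gamma=1.0):
    # Kernel matrix K_ij = exp(-gamma * ||x_i - x_j||^2)
    sq = np.sum(X ** 2, axis=1)
    K = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2 * X @ X.T))

    # Center the kernel matrix (centering in feature space)
    n = K.shape[0]
    one_n = np.ones((n, n)) / n
    Kc = K - one_n @ K - K @ one_n + one_n @ K @ one_n

    # Eigendecomposition; columns of `alphas` are expansion coefficients
    eigvals, eigvecs = np.linalg.eigh(Kc)
    idx = np.argsort(eigvals)[::-1][:n_components]
    alphas = eigvecs[:, idx] / np.sqrt(eigvals[idx])  # unit norm in feature space

    return Kc @ alphas   # projections of the training points
```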

8,175 citations


Proceedings Article
01 Jan 1998

1,543 citations


Proceedings Article
01 Dec 1998
TL;DR: This work presents ideas for finding approximate pre-images, focusing on Gaussian kernels, and shows experimental results using these pre- images in data reconstruction and de-noising on toy examples as well as on real world data.
Abstract: Kernel PCA as a nonlinear feature extractor has proven powerful as a preprocessing step for classification algorithms. But it can also be considered as a natural generalization of linear principal component analysis. This gives rise to the question of how to use nonlinear features for data compression, reconstruction, and de-noising, applications common in linear PCA. This is a nontrivial task, as the results provided by kernel PCA live in some high-dimensional feature space and need not have pre-images in input space. This work presents ideas for finding approximate pre-images, focusing on Gaussian kernels, and shows experimental results using these pre-images in data reconstruction and de-noising on toy examples as well as on real-world data.
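
For Gaussian kernels, approximate pre-images are commonly obtained by a fixed-point iteration over a candidate point z; a minimal sketch follows, where gamma denotes the assumed expansion coefficients of the feature-space point being reconstructed and the initialization is an illustrative choice:

```python
# Sketch: fixed-point iteration for an approximate pre-image under a
# Gaussian kernel k(x, z) = exp(-||x - z||^2 / (2 sigma^2)).
import numpy as np

def gaussian_preimage(X, gamma, sigma, n_iter=100):
    # X: (n, d) training points; gamma: (n,) expansion coefficients.
    z = X[np.argmax(gamma)].copy()           # crude initialization
    for _ in range(n_iter):
        w = gamma * np.exp(-np.sum((X - z) ** 2, axis=1) / (2 * sigma ** 2))
        if abs(w.sum()) < 1e-12:             # guard against degenerate weights
            break
        z = (w[:, None] * X).sum(axis=0) / w.sum()
    return z
```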

1,031 citations


Journal Article
TL;DR: It is proved that the Green's functions associated with regularization operators are suitable support vector kernels with equivalent regularization properties, and it is shown that a large number of radial basis functions, namely conditionally positive definite functions, may be used as support vector kernels.
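
The core correspondence can be stated compactly. As a hedged paraphrase of the relation described in the TL;DR (my sketch, not a quotation from the paper): a kernel k is an admissible support vector kernel with regularization properties equivalent to a regularization operator P when k is the Green's function of P*P:

```latex
% Sketch: Green's-function correspondence between a regularization
% operator P and a support vector kernel k (paraphrase, not verbatim).
\[
  (P^{*}P \, k)(x_i, \cdot) = \delta_{x_i}(\cdot),
\]
% so the SV expansion f(x) = \sum_i \alpha_i\, k(x_i, x) minimizes a
% \lVert P f \rVert^2-regularized risk.
```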

668 citations


Book Chapter
02 Sep 1998
TL;DR: An algorithm for blind source separation based on several time-delayed second-order correlation matrices is proposed, and its efficiency and stability are demonstrated for linear artificial mixtures with 17 sources.
Abstract: An algorithm for blind source separation based on several time-delayed second-order correlation matrices is proposed. The technique to construct the unmixing matrix employs first a whitening step and then an approximate simultaneous diagonalisation of several time-delayed second-order correlation matrices. Its efficiency and stability are demonstrated for linear artificial mixtures with 17 sources.
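
A minimal illustration of the two ingredients, whitening followed by diagonalization, simplified here to a single time-delayed correlation matrix (the paper diagonalizes several simultaneously, which is more robust); the delay tau and all names are illustrative:

```python
# Sketch: second-order blind source separation, simplified to one delay.
import numpy as np

def bss_second_order(X, tau=1):
    # X: (channels, samples) mixed signals.
    X = X - X.mean(axis=1, keepdims=True)

    # Whitening step
    C0 = X @ X.T / X.shape[1]
    d, E = np.linalg.eigh(C0)
    W_white = E @ np.diag(1.0 / np.sqrt(d)) @ E.T
    Z = W_white @ X

    # Symmetrized time-delayed correlation matrix of the whitened data
    C_tau = Z[:, :-tau] @ Z[:, tau:].T / (Z.shape[1] - tau)
    C_tau = (C_tau + C_tau.T) / 2

    # Its eigenvectors supply the remaining rotation
    _, V = np.linalg.eigh(C_tau)
    return V.T @ W_white   # estimated unmixing matrix
```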

398 citations


Book Chapter
02 Sep 1998
TL;DR: In this article, the authors show that there exists a nontrivial choice of the insensitivity parameter in Vapnik's ε-insensitive loss function which scales linearly with the input noise of the training data.
Abstract: Under the assumption of asymptotically unbiased estimators we show that there exists a nontrivial choice of the insensitivity parameter in Vapnik's ε-insensitive loss function which scales linearly with the input noise of the training data. This finding is backed by experimental results.
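
In practice, the finding suggests choosing ε proportional to an estimated noise level of the targets; a minimal scikit-learn sketch, where the proportionality constant c and all other settings are placeholder assumptions rather than values from the paper:

```python
# Sketch: set the eps-insensitive tube width proportional to the noise level.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = np.linspace(-3, 3, 200)[:, None]
sigma = 0.2                                   # (assumed known) noise level
y = np.sinc(X).ravel() + rng.normal(0, sigma, 200)

c = 1.0                                       # hypothetical scaling constant
model = SVR(kernel="rbf", epsilon=c * sigma)  # epsilon scales linearly with the noise
model.fit(X, y)
```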

130 citations


Book Chapter
02 Sep 1998
TL;DR: This work has shown that the reconstruction of patterns from their largest nonlinear principal components, a technique which is common practice in linear principal component analysis, can be performed using a kernel without explicitly working in F.
Abstract: Algorithms based on Mercer kernels construct their solutions in terms of expansions in a high-dimensional feature space F. Previous work has shown that all algorithms which can be formulated in terms of dot products in F can be performed using a kernel without explicitly working in F. The list of such algorithms includes support vector machines and nonlinear kernel principal component extraction. So far, however, it did not include the reconstruction of patterns from their largest nonlinear principal components, a technique which is common practice in linear principal component analysis.

123 citations


Proceedings Article
01 Jan 1998
TL;DR: The paper shows asymptotic experimental results with RBF networks for the binary classification case and proposes a regularized improved version of AdaBoost, called AdaBoost_Reg, whose usefulness is demonstrated in numerical simulations.
Abstract: Recent work has shown that combining multiple versions of weak classifiers such as decision trees or neural networks results in reduced test set error. To study this in greater detail, we analyze the asymptotic behavior of AdaBoost. The theoretical analysis establishes the relation between the distribution of margins of the training examples and the generated voting classification rule. The paper shows asymptotic experimental results with RBF networks for the binary classification case, underlining the theoretical findings. Our experiments show that AdaBoost does indeed overfit. In order to avoid this and to get better generalization performance, we propose a regularized improved version of AdaBoost, which is called AdaBoost_Reg. We show the usefulness of this improvement in numerical simulations.
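
The analysis revolves around the margins of the training examples under the voting rule; a minimal sketch of computing them for a stock AdaBoost ensemble follows. Decision stumps via scikit-learn are an illustrative stand-in for the RBF networks used in the paper's experiments, and all names are assumptions:

```python
# Sketch: margin distribution of training examples under AdaBoost voting.
# margin(x_i) = y_i * sum_t w_t h_t(x_i) / sum_t w_t, with labels in {-1,+1}.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

X, y = make_classification(n_samples=300, random_state=0)
y_pm = 2 * y - 1                                   # map {0,1} -> {-1,+1}

clf = AdaBoostClassifier(n_estimators=100, random_state=0).fit(X, y)

H = np.array([2 * est.predict(X) - 1 for est in clf.estimators_])  # h_t in {-1,+1}
w = clf.estimator_weights_[: len(clf.estimators_)]
margins = y_pm * (w @ H) / w.sum()   # in [-1, 1]; small margins flag hard examples
```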

56 citations


Book Chapter
02 Sep 1998
TL;DR: The paper shows asymptotic experimental results for the binary classification case; the relation between model complexity and noise in the training data, and how to improve AdaBoost-type algorithms in practice, are also discussed.
Abstract: Recent work has shown that combining multiple versions of weak classifiers such as decision trees or neural networks results in reduced test set error. To study this in greater detail, we analyze the asymptotic behavior of AdaBoost-type algorithms. The theoretical analysis establishes the relation between the distribution of margins of the training examples and the generated voting classification rule. The paper shows asymptotic experimental results for the binary classification case, underlining the theoretical findings. Finally, we discuss the relation between model complexity and noise in the training data, and how to improve AdaBoost-type algorithms in practice.

22 citations



Book Chapter
02 Sep 1998
TL;DR: The concept of Support Vector Regression is extended, and it is shown how the resulting convex constrained optimization problems can be efficiently solved by a primal-dual interior-point path-following method.
Abstract: The concept of Support Vector Regression is extended to a more general class of convex cost functions. It is shown how the resulting convex constrained optimization problems can be efficiently solved by a primal-dual interior-point path-following method. Both computational feasibility and improved estimation are demonstrated in the experiments.
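
As a rough illustration of the setup (an ε-insensitive cost posed as a generic convex program and handed to an off-the-shelf interior-point solver via cvxpy), here is a sketch; it is not the paper's primal-dual path-following implementation, and all names and constants are assumptions:

```python
# Sketch: linear eps-insensitive SVR as a convex program; cvxpy's default
# conic solvers (e.g. Clarabel/ECOS) are interior-point methods.
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=50)

w, b = cp.Variable(3), cp.Variable()
eps, C = 0.1, 1.0
residual = y - X @ w - b
loss = cp.sum(cp.pos(cp.abs(residual) - eps))   # eps-insensitive cost
cp.Problem(cp.Minimize(0.5 * cp.sum_squares(w) + C * loss)).solve()
```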

Patent
11 Sep 1998
TL;DR: In this paper, a time series of at least one system variable x(t) is first subjected to modeling, for example switch segmentation, so that in each time segment of a predetermined minimum length a prediction model for a system mode is detected for each system variable. Modeling is followed by drift segmentation, in which each transition of the system from a first mode to a second is described by a series of mixed prediction models produced by linear, paired superimposition of the two modes' prediction models.
Abstract: In a method for detecting the modes of a dynamic system with a large number of modes, each having a set α(t) of characteristic system parameters, a time series of at least one system variable x(t) is subjected to modeling, for example switch segmentation, so that in each time segment of a predetermined minimum length a predetermined prediction model, for example a neural network, is detected for each system variable x(t). Modeling of the time series is followed by drift segmentation: in each time segment in which the system passes from a first system mode to a second, a series of mixed prediction models is detected, produced by linear, paired superimposition of the prediction models of the two system modes.
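
The drift-segmentation step amounts to predicting with a linear, paired superposition of two modes' prediction models during a transition; a tiny sketch, in which the linear ramp weight is an illustrative stand-in for the patent's estimation procedure:

```python
# Sketch: mixed prediction during a mode transition, y(t) = (1-a) f1(x) + a f2(x),
# where a ramps from 0 to 1 across the transition segment (illustrative weighting).
import numpy as np

def mixed_prediction(f1, f2, x, t, t_start, t_end):
    a = np.clip((t - t_start) / (t_end - t_start), 0.0, 1.0)
    return (1.0 - a) * f1(x) + a * f2(x)
```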

Proceedings Article
31 May 1998
TL;DR: Different low-complexity approaches for generation of virtual-viewpoint camera signals are described, which are based on disparity-processing techniques and can hence be implemented with much lower complexity than full 3D analysis of natural objects or scenes.
Abstract: Viewpoint adaptation from multiple-viewpoint video captures is an important tool for the telepresence illusion in stereoscopic presentation of natural scenes, and for the integration of real-world video objects into virtual 3D worlds. This paper describes different low-complexity approaches for the generation of virtual-viewpoint camera signals, which are based on disparity-processing techniques and can hence be implemented with much lower complexity than full 3D analysis of natural objects or scenes. A real-time hardware system based on one of our algorithms has already been developed.
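
As a minimal illustration of the disparity-processing idea (not the paper's algorithms), an intermediate viewpoint can be approximated by shifting pixels of one view by a fraction alpha of the per-pixel disparity; the sketch below does a simplified scanline forward warp with no occlusion handling or hole filling:

```python
# Sketch: virtual-view synthesis by fractional disparity shift (simplified).
import numpy as np

def virtual_view(left, disparity, alpha=0.5):
    # left: (h, w[, 3]) image; disparity: (h, w) per-pixel disparity in pixels;
    # alpha in [0, 1] places the virtual camera between the two real views.
    h, w = left.shape[:2]
    out = np.zeros_like(left)
    cols = np.arange(w)
    for r in range(h):
        target = np.clip((cols - alpha * disparity[r]).astype(int), 0, w - 1)
        out[r, target] = left[r, cols]   # forward warp along the scanline
    return out
```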


Journal Article
TL;DR: It is shown that predicting the continuation of Data Set A is nothing other than a pattern-matching problem, and it is demonstrated that simple pattern matching performs as well as sophisticated prediction methods on Data Set A.
Abstract: Several data sets have been proposed for benchmarking in time series prediction. A popular one is Data Set A from the Santa Fe Competition. This data set has been the subject of analysis in many papers. In this note, it is shown that predicting the continuation of Data Set A is nothing other than a pattern-matching problem. Looking at studies of this data set, it is remarkable that most of the very good forecasts of Data Set A used upsampled training data. We explain why upsampling is crucial for this data set. Finally, it is demonstrated that simple pattern matching performs as well as sophisticated prediction methods on Data Set A.
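
The note's claim is easy to state in code: find the historical window closest to the most recent one and copy what followed it; a minimal nearest-neighbor sketch, with window and horizon as illustrative parameters:

```python
# Sketch: forecast by nearest-neighbor pattern matching over past windows.
import numpy as np

def pattern_match_forecast(series, window=20, horizon=10):
    query = series[-window:]
    best_err, best_i = np.inf, 0
    for i in range(len(series) - window - horizon):
        err = np.sum((series[i:i + window] - query) ** 2)
        if err < best_err:
            best_err, best_i = err, i
    # return the continuation that followed the best-matching window
    return series[best_i + window : best_i + window + horizon]
```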


Proceedings Article
01 Jan 1998