
Showing papers by "Klaus-Robert Müller published in 2004"


Journal ArticleDOI
TL;DR: The BCI Competition 2003 was organized to evaluate the current state of the art in signal processing and classification methods for brain-computer interfaces; the paper describes the six competition data sets and the results and function of the most successful algorithms.
Abstract: Interest in developing a new method of man-to-machine communication, a brain-computer interface (BCI), has grown steadily over the past few decades. BCIs create a new communication channel between the brain and an output device by bypassing conventional motor output pathways of nerves and muscles. These systems use signals recorded from the scalp, the surface of the cortex, or from inside the brain to enable users to control a variety of applications including simple word-processing software and orthotics. BCI technology could therefore provide a new communication and control option for individuals who cannot otherwise express their wishes to the outside world. Signal processing and classification methods are essential tools in the development of improved BCI technology. We organized the BCI Competition 2003 to evaluate the current state of the art of these tools. Four laboratories well versed in EEG-based BCI research provided six data sets in a documented format. We made these data sets (i.e., labeled training sets and unlabeled test sets) and their descriptions available on the Internet. The goal in the competition was to maximize the performance measure for the test labels. Researchers worldwide tested their algorithms and competed for the best classification results. This paper describes the six data sets and the results and function of the most successful algorithms.

667 citations


Journal ArticleDOI
TL;DR: It is shown that a suitably arranged interaction between these concepts can significantly boost BCI performance; information-theoretic predictions are derived and their relevance demonstrated on experimental data.
Abstract: Noninvasive electroencephalogram (EEG) recordings provide for easy and safe access to human neocortical processes which can be exploited for a brain-computer interface (BCI). At present, however, the use of BCIs is severely limited by low bit-transfer rates. We systematically analyze and develop two recent concepts, both capable of enhancing the information gain from multichannel scalp EEG recordings: 1) the combination of classifiers, each specifically tailored for different physiological phenomena, e.g., slow cortical potential shifts, such as the premovement Bereitschaftspotential or differences in spatio-spectral distributions of brain activity (i.e., focal event-related desynchronizations) and 2) behavioral paradigms inducing the subjects to generate one out of several brain states (multiclass approach) which all bear a distinctive spatio-temporal signature well discriminable in the standard scalp EEG. We derive information-theoretic predictions and demonstrate their relevance in experimental data. We will show that a suitably arranged interaction between these concepts can significantly boost BCI performance.
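
As a concrete illustration of the kind of information-theoretic prediction discussed above, the widely used Wolpaw bitrate formula estimates the information transferred per trial from the number of classes and the classification accuracy. The minimal Python sketch below is an editorial illustration, not code from the paper.

import math

def bits_per_trial(n_classes: int, accuracy: float) -> float:
    """Wolpaw-style information transfer rate in bits per trial."""
    p = accuracy
    if p <= 1.0 / n_classes:
        return 0.0
    if p >= 1.0:
        return math.log2(n_classes)
    return (math.log2(n_classes)
            + p * math.log2(p)
            + (1.0 - p) * math.log2((1.0 - p) / (n_classes - 1)))

# e.g. a 3-class paradigm at 80% accuracy vs. a binary paradigm at 90% accuracy
print(bits_per_trial(3, 0.80), bits_per_trial(2, 0.90))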

614 citations


Journal Article
TL;DR: A new efficient algorithm is presented for joint diagonalization of several matrices; it is based on the Frobenius-norm formulation of the joint diagonalization problem and addresses diagonalization with a general, non-orthogonal transformation.
Abstract: A new efficient algorithm is presented for joint diagonalization of several matrices. The algorithm is based on the Frobenius-norm formulation of the joint diagonalization problem, and addresses diagonalization with a general, non-orthogonal transformation. The iterative scheme of the algorithm is based on a multiplicative update which ensures the invertibility of the diagonalizer. The algorithm's efficiency stems from the special approximation of the cost function resulting in a sparse, block-diagonal Hessian to be used in the computation of the quasi-Newton update step. Extensive numerical simulations illustrate the performance of the algorithm and provide a comparison to other leading diagonalization methods. The results of such comparison demonstrate that the proposed algorithm is a viable alternative to existing state-of-the-art joint diagonalization algorithms. The practical use of our algorithm is shown for blind source separation problems.
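
For orientation, the Frobenius-norm cost that such non-orthogonal joint diagonalization methods minimize, the sum of squared off-diagonal entries of V C_k V^T over all target matrices C_k, can be written down in a few lines of Python. This is a hedged sketch of the objective only, not the paper's multiplicative quasi-Newton algorithm.

import numpy as np

def offdiag_cost(V, Cs):
    """Frobenius-norm joint-diagonalization cost: sum_k ||off(V C_k V^T)||_F^2."""
    cost = 0.0
    for C in Cs:
        M = V @ C @ V.T
        cost += np.sum(M ** 2) - np.sum(np.diag(M) ** 2)
    return cost

# Toy check: matrices C_k = A D_k A^T share the mixing matrix A, so V = A^{-1}
# drives the off-diagonal cost to (numerically) zero.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
Cs = [A @ np.diag(rng.standard_normal(4)) @ A.T for _ in range(5)]
print(offdiag_cost(np.eye(4), Cs), offdiag_cost(np.linalg.inv(A), Cs))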

266 citations


Journal Article
TL;DR: It is shown by a simple, exploratory analysis that the negative eigenvalues can code for relevant structure in the data, thus leading to the discovery of new features, which were lost by conventional data analysis techniques.
Abstract: Pairwise proximity data, given as a similarity or dissimilarity matrix, can violate metricity. This occurs either due to noise, fallible estimates, or due to intrinsic non-metric features such as those arising from human judgments. So far the problem of non-metric pairwise data has been tackled by essentially omitting the negative eigenvalues or shifting the spectrum of the associated (pseudo-)covariance matrix for a subsequent embedding. However, little attention has been paid to the negative part of the spectrum itself. In particular, no answer was given as to whether the directions associated with the negative eigenvalues code any variance beyond noise. We show by a simple, exploratory analysis that the negative eigenvalues can code for relevant structure in the data, thus leading to the discovery of new features that are lost by conventional data analysis techniques. The information hidden in the negative part of the spectrum is illustrated and discussed for three data sets, namely USPS handwritten digits, a text-mining data set, and data from cognitive psychology.
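
To make the role of the negative eigenvalues concrete, the sketch below double-centers a squared dissimilarity matrix as in classical scaling and returns the full eigenvalue spectrum, whose negative part the paper examines. The toy dissimilarity matrix is an invented illustration, not one of the paper's data sets.

import numpy as np

def centered_spectrum(D):
    """Eigenvalues of the doubly centered (pseudo-)Gram matrix -1/2 J D^2 J
    obtained from a symmetric dissimilarity matrix D (classical scaling)."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    G = -0.5 * J @ (D ** 2) @ J
    return np.sort(np.linalg.eigvalsh(G))[::-1]

# A non-metric toy dissimilarity (it violates the triangle inequality) yields a
# negative eigenvalue; the paper argues such directions can carry real structure.
D = np.array([[0.0, 1.0, 1.0],
              [1.0, 0.0, 3.0],
              [1.0, 3.0, 0.0]])
print(centered_spectrum(D))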

102 citations


Journal ArticleDOI
TL;DR: This contribution proposes a novel formulation of a one-class Support Vector Machine (SVM) specially designed for typical IDS data features, to encompass the data with a hypersphere anchored at the center of mass of the data in feature space.
Abstract: Practical application of data mining and machine learning techniques to intrusion detection is often hindered by the difficulty of producing clean data for training. To address this problem, a geometric framework for unsupervised anomaly detection has recently been proposed. In this framework, the data are mapped into a feature space, and anomalies are detected as the entries in sparsely populated regions. In this contribution we propose a novel formulation of a one-class Support Vector Machine (SVM) specially designed for typical IDS data features. The key idea of our "quarter-sphere" algorithm is to encompass the data with a hypersphere anchored at the center of mass of the data in feature space. The proposed method and its behavior under varying percentages of attacks in the data are evaluated on the KDDCup 1999 dataset.
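
The following Python sketch illustrates only the anchoring idea described above: scoring points by their feature-space distance to the center of mass of the data, computed with a kernel. It is a simplified stand-in, not the paper's quarter-sphere formulation (which additionally exploits the one-sided nature of typical IDS features); the RBF kernel and the gamma value are assumptions of this sketch.

import numpy as np

def rbf(X, Y, gamma=0.1):
    """RBF kernel matrix between the rows of X and the rows of Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def center_of_mass_scores(X_train, X_test, gamma=0.1):
    """Anomaly score = squared feature-space distance to the center of mass:
    ||phi(x) - mu||^2 = k(x,x) - (2/n) sum_i k(x,x_i) + (1/n^2) sum_ij k(x_i,x_j)."""
    K_tt = rbf(X_test, X_train, gamma)
    K_train = rbf(X_train, X_train, gamma)
    k_xx = np.ones(len(X_test))          # k(x, x) = 1 for the RBF kernel
    return k_xx - 2.0 * K_tt.mean(axis=1) + K_train.mean()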

88 citations



Book ChapterDOI
22 Sep 2004
TL;DR: In this paper, a non-unitary approximate joint diagonalization (AJD) algorithm is proposed, which is based on a natural gradient-type multiplicative update of the diagonalizing matrix.
Abstract: We present a new algorithm for non-unitary approximate joint diagonalization (AJD), based on a “natural gradient”-type multiplicative update of the diagonalizing matrix, complemented by step-size optimization at each iteration. The advantages of the new algorithm over existing non-unitary AJD algorithms are in the ability to accommodate non-positive-definite matrices (compared to Pham’s algorithm), in the low computational load per iteration (compared to Yeredor’s AC-DC algorithm), and in the theoretically guaranteed convergence to a true (possibly local) minimum (compared to Ziehe et al.’s FFDiag algorithm).
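
To illustrate the multiplicative structure with step-size selection, the sketch below performs one update V <- (I + mu W) V with a plain gradient direction and a brute-force line search. It is a crude stand-in for illustration only, not the authors' natural-gradient update (which, among other things, guarantees invertibility of the diagonalizer); the step-size grid is an assumption.

import numpy as np

def offdiag(M):
    return M - np.diag(np.diag(M))

def ajd_step(V, Cs, step_sizes=np.linspace(0.0, 1.0, 21)):
    """One multiplicative update V <- (I + mu*W) V, with W a descent direction of
    sum_k ||off(V C_k V^T)||_F^2 and mu chosen by a brute-force line search.
    The matrices in Cs are assumed symmetric."""
    n = V.shape[0]
    G = np.zeros((n, n))
    for C in Cs:
        M = V @ C @ V.T
        G += 4.0 * offdiag(M) @ M        # gradient w.r.t. the multiplicative factor at W = 0
    W = -offdiag(G)                      # restrict the update to off-diagonal directions
    W /= np.linalg.norm(W) + 1e-12
    I = np.eye(n)
    def cost(mu):
        Vn = (I + mu * W) @ V
        return sum(np.sum(offdiag(Vn @ C @ Vn.T) ** 2) for C in Cs)
    mu = min(step_sizes, key=cost)       # the grid includes mu = 0, so the cost never increases
    return (I + mu * W) @ V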

49 citations


Journal ArticleDOI
TL;DR: The concept of BSS is reviewed and its usefulness in the context of event-related MEG measurements is demonstrated; an additional grouping of the BSS components reveals interesting structure that could ultimately be used for gaining a better physiological model of the data.
Abstract: Recently, blind source separation (BSS) methods have been highly successful when applied to biomedical data. This paper reviews the concept of BSS and demonstrates its usefulness in the context of event-related MEG measurements. In a first experiment we apply BSS to artifact identification of raw MEG data and discuss how the quality of the resulting independent component projections can be evaluated. The second part of our study considers averaged data of event-related magnetic fields. Here, it is particularly important to monitor and thus avoid possible overfitting due to limited sample size. A stability assessment of the BSS decomposition allows us to solve this task, and an additional grouping of the BSS components reveals interesting structure that could ultimately be used for gaining a better physiological model of the data.
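
As a rough illustration of the artifact-identification step (not the stability or grouping analysis), one can decompose the multichannel recording with an off-the-shelf ICA implementation and flag components whose time courses correlate with a reference artifact signal. scikit-learn's FastICA, the reference channel, and the threshold below are assumptions of this sketch, not necessarily the BSS method or criteria used in the paper.

import numpy as np
from sklearn.decomposition import FastICA

def flag_artifact_components(data, ref, n_components=20, corr_threshold=0.4):
    """data: channels x samples MEG array; ref: (samples,) reference artifact
    trace such as a simultaneously recorded ECG or EOG channel.
    Returns indices of ICA components correlating with the reference, and the model."""
    ica = FastICA(n_components=n_components, random_state=0)
    sources = ica.fit_transform(data.T).T          # components x samples
    corr = [abs(np.corrcoef(s, ref)[0, 1]) for s in sources]
    flagged = [i for i, c in enumerate(corr) if c > corr_threshold]
    return flagged, ica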

48 citations


Proceedings ArticleDOI
23 Aug 2004
TL;DR: A simple selection criterion for hyper-parameters in one-class classifiers (OCCs) is proposed, which makes use of the particular structure of the one-class problem to define the most complex classifier that can still reliably be trained on the data.
Abstract: Model selection in unsupervised learning is a hard problem. In this paper, a simple selection criterion for hyper-parameters in one-class classifiers (OCCs) is proposed. It makes use of the particular structure of the one-class problem. The main idea is that the complexity of the classifier is increased until the classifier becomes inconsistent on the target class. This defines the most complex classifier that can still reliably be trained on the data. Experiments indicate the usefulness of the approach.
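
A minimal sketch of the selection idea, using scikit-learn's OneClassSVM with an RBF kernel as the one-class classifier; the classifier family, the use of gamma as the complexity parameter, and the slack value are assumptions of this sketch rather than specifics from the paper.

import numpy as np
from sklearn.svm import OneClassSVM

def select_gamma(X_train, X_val, gammas, nu=0.1, slack=0.05):
    """Increase complexity (gamma) until the classifier becomes inconsistent on the
    target class, i.e. rejects clearly more held-out target data than the requested
    error rate nu; return the most complex gamma that was still consistent."""
    chosen = None
    for gamma in sorted(gammas):                   # small gamma = simple, large gamma = complex
        clf = OneClassSVM(kernel="rbf", nu=nu, gamma=gamma).fit(X_train)
        val_reject = np.mean(clf.predict(X_val) == -1)
        if val_reject <= nu + slack:
            chosen = gamma                         # still consistent on the target class
        else:
            break                                  # inconsistent: stop increasing complexity
    return chosen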

44 citations


Proceedings ArticleDOI
01 Jan 2004
TL;DR: Two directions in which brain-computer interfacing can be enhanced by exploiting the lateralized readiness potential are presented: for establishing a rapid response BCI system that can predict the laterality of upcoming finger movements before EMG onset even in time critical contexts, and to improve information transfer rates in the common BCI approach relying on imagined limb movements.
Abstract: To enhance human interaction with machines, research interest is growing to develop a 'brain-computer interface', which allows communication of a human with a machine only by use of brain signals. So far, the applicability of such an interface is strongly limited by low bit-transfer rates, slow response times and long training sessions for the subject. The Berlin Brain-Computer Interface (BBCI) project is guided by the idea to train a computer by advanced machine learning techniques both to improve classification performance and to reduce the need for subject training. In this paper we present two directions in which brain-computer interfacing can be enhanced by exploiting the lateralized readiness potential: (1) for establishing a rapid response BCI system that can predict the laterality of upcoming finger movements before EMG onset even in time critical contexts, and (2) to improve information transfer rates in the common BCI approach relying on imagined limb movements.

44 citations


Posted Content
TL;DR: In this article, the authors presented two new tools for the identification of faking interviewers in surveys, one based on Benford's Law, and the other exploiting the empirical observation that fakers most often produce answers with less variability than could be expected from the whole survey.
Abstract: This paper presents two new tools for the identification of faking interviewers in surveys. One method is based on Benford's Law, and the other exploits the empirical observation that fakers most often produce answers with less variability than could be expected from the whole survey. We focus on fabricated data, which were taken out of the survey before the data were disseminated in the German Socio-Economic Panel (SOEP). For two samples, the resulting rankings of the interviewers with respect to their cheating behavior are given. For both methods all of the evident fakers are identified.
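
The Benford-based screening can be sketched in a few lines: compare the distribution of first significant digits of an interviewer's nonzero numeric answers with Benford's law and rank interviewers by the resulting chi-square statistic. This covers only the Benford half of the paper's toolbox, and the exact test statistic and data handling used by the authors may differ.

import numpy as np

BENFORD = np.array([np.log10(1 + 1 / d) for d in range(1, 10)])

def first_digit(x):
    """First significant digit of a nonzero number."""
    x = abs(x)
    while x >= 10:
        x /= 10.0
    while x < 1:
        x *= 10.0
    return int(x)

def benford_chi2(values):
    """Chi-square statistic comparing observed first-digit frequencies with
    Benford's law; larger values are more suspicious."""
    digits = [first_digit(v) for v in values if v != 0]
    observed = np.array([digits.count(d) for d in range(1, 10)], dtype=float)
    expected = len(digits) * BENFORD
    return float(np.sum((observed - expected) ** 2 / expected))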

Journal ArticleDOI
TL;DR: It is underlined that the Fisher kernel should be viewed not as a heuristic but as a powerful statistical tool with well-controlled statistical properties.
Abstract: This letter analyzes the Fisher kernel from a statistical point of view. The Fisher kernel is a particularly interesting method for constructing a model of the posterior probability that makes intelligent use of unlabeled data (i.e., of the underlying data density). It is important to analyze and ultimately understand the statistical properties of the Fisher kernel. To this end, we first establish sufficient conditions that the constructed posterior model is realizable (i.e., it contains the true distribution). Realizability immediately leads to consistency results. Subsequently, we focus on an asymptotic analysis of the generalization error, which elucidates the learning curves of the Fisher kernel and how unlabeled data contribute to learning. We also point out that the squared or log loss is theoretically preferable to other losses such as the exponential loss when a linear classifier is used together with the Fisher kernel, because both the squared and the log loss yield consistent estimators. Therefore, this letter underlines that the Fisher kernel should be viewed not as a heuristic but as a powerful statistical tool with well-controlled statistical properties.
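
To make the object under study concrete, here is a toy Fisher kernel for a single univariate Gaussian fitted to unlabeled data: the Fisher score is the gradient of the log-likelihood with respect to the parameters, and the kernel is the score inner product weighted by the (here empirically estimated) Fisher information. The Gaussian model and empirical estimation are assumptions of this sketch; the letter's analysis applies to general probabilistic models.

import numpy as np

def fisher_scores(x, mu, sigma):
    """U(x) = grad_{mu, sigma} log N(x | mu, sigma^2), one row per sample."""
    x = np.asarray(x, dtype=float)
    d_mu = (x - mu) / sigma ** 2
    d_sigma = ((x - mu) ** 2 - sigma ** 2) / sigma ** 3
    return np.stack([d_mu, d_sigma], axis=1)

def fisher_kernel(x, y, mu, sigma):
    """K(x, y) = U(x)^T F^{-1} U(y), with the Fisher information F estimated
    from the scores of x (in practice F is often simply replaced by the identity)."""
    Ux, Uy = fisher_scores(x, mu, sigma), fisher_scores(y, mu, sigma)
    F = Ux.T @ Ux / len(Ux)
    return Ux @ np.linalg.solve(F, Uy.T)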

Journal ArticleDOI
TL;DR: This article derives an unbiased estimator of the expected squared error between SIC and the expected generalization error, and proposes determining the degree of regularization of SIC such that this estimator of the expected squared error is minimized.
Abstract: A well-known result by Stein (1956) shows that in particular situations, biased estimators can yield better parameter estimates than their generally preferred unbiased counterparts. This letter follows the same spirit, as we will stabilize the unbiased generalization error estimates by regularization and finally obtain more robust model selection criteria for learning. We trade a small bias against a larger variance reduction, which has the beneficial effect of being more precise on a single training set. We focus on the subspace information criterion (SIC), which is an unbiased estimator of the expected generalization error measured by the reproducing kernel Hilbert space norm. SIC can be applied to kernel regression, and it was shown in earlier experiments that a small regularization of SIC has a stabilizing effect. However, it remained open how to appropriately determine the degree of regularization in SIC. In this article, we derive an unbiased estimator of the expected squared error between SIC and the expected generalization error, and propose determining the degree of regularization of SIC such that this estimator of the expected squared error is minimized. Computer simulations with artificial and real data sets illustrate that the proposed method works effectively for improving the precision of SIC, especially in the high-noise-level cases. We furthermore compare the proposed method to the original SIC, cross-validation, and an empirical Bayesian method in ridge parameter selection, with good results.

Patent
17 Aug 2004
TL;DR: A method is presented for automatic online detection and classification of anomalous objects in a data stream: a geometric representation of normality (e.g., a hypersurface enclosing a finite number of normal objects) is constructed from the incoming objects, adapted online as further objects arrive, and used to classify received objects as normal or anomalous.
Abstract: The invention concerns a method for automatic online detection and classification of anomalous objects in a data stream, especially one comprising datasets and/or signals, characterized by a) the detection of at least one incoming data stream (1000) containing normal and anomalous objects, b) automatic construction (2100) of a geometric representation of normality (2200) from the incoming objects of the data stream (1000) at a time t1, subject to at least one predefined optimality condition, especially the construction of a hypersurface enclosing a finite number of normal objects, c) online adaptation of the geometric representation of normality (2200) with respect to at least one object received at a time t2 >= t1, the adaptation being subject to at least one predefined optimality condition, d) online determination of a normality classification (2300) for objects received at t2 with respect to the geometric representation of normality (2200), and e) automatic classification of normal and anomalous objects based on the generated normality classification (2300), together with generation of a data set describing the anomalous data for further processing, especially a visual representation.
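
As an illustration of the online workflow the claim describes (detect, build a normality model, adapt it, classify), the sketch below maintains only a running center of mass and a running mean distance. It is a deliberately crude stand-in, not the patented hypersurface-based method; the class name and threshold factor are invented for illustration.

import numpy as np

class OnlineCentroidDetector:
    """Toy online normality model: running center of mass plus a distance-based
    anomaly rule. Stands in for an adaptive geometric representation of normality
    (e.g. an enclosing hypersurface), which it does not implement."""

    def __init__(self, threshold_factor=3.0):
        self.n = 0
        self.center = None
        self.mean_dist = 0.0
        self.factor = threshold_factor

    def update_and_classify(self, x):
        x = np.asarray(x, dtype=float)
        if self.center is None:                      # the first object initializes the model
            self.n, self.center = 1, x.copy()
            return "normal"
        dist = np.linalg.norm(x - self.center)
        anomalous = self.mean_dist > 0 and dist > self.factor * self.mean_dist
        self.n += 1                                  # online adaptation of the model
        self.center += (x - self.center) / self.n
        self.mean_dist += (dist - self.mean_dist) / (self.n - 1)
        return "anomalous" if anomalous else "normal"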

Journal ArticleDOI
TL;DR: This work presents a new method that constructively injects noise to assess the reliability and the grouping structure of empirical ICA component estimates and demonstrates that the approach is useful for exploratory data analysis of real-world data.

Book ChapterDOI
22 Sep 2004
TL;DR: This work shows how a simple outlier index can be used directly to solve the ICA problem for super-Gaussian source signals; the resulting method is outlier-robust by construction and can be used for standard ICA as well as for over-complete ICA.
Abstract: Most ICA algorithms are sensitive to outliers. Instead of robustifying existing algorithms by outlier rejection techniques, we show how a simple outlier index can be used directly to solve the ICA problem for super-Gaussian source signals. This ICA method is outlier-robust by construction and can be used for standard ICA as well as for over-complete ICA (i.e. more source signals than observed signals (mixtures)).





Proceedings Article
01 Jan 2004
TL;DR: This paper proposes regularizing unbiased generalization error estimators for stabilization, trading a small bias in a model selection criterion against a larger variance reduction, which has the beneficial effect of being more precise on a single training set.
Abstract: A well-known result by Stein shows that regularized estimators with small bias often yield better estimates than unbiased estimators. In this paper, we adapt this spirit to model selection, and propose regularizing unbiased generalization error estimators for stabilization. We trade a small bias in a model selection criterion against a larger variance reduction which has the beneficial effect of being more precise on a single training set.