Showing papers by "Klaus-Robert Müller published in 2006"


Journal ArticleDOI
19 Jun 2006
TL;DR: The third BCI Competition was organized to address several of the most difficult and important analysis problems in BCI research; the paper describes the data sets that were provided to the competitors and gives an overview of the results.
Abstract: A brain-computer interface (BCI) is a system that allows its users to control external devices with brain activity. Although the proof-of-concept was given decades ago, the reliable translation of user intent into device control commands is still a major challenge. Success requires the effective interaction of two adaptive controllers: the user's brain, which produces brain activity that encodes intent, and the BCI system, which translates that activity into device control commands. In order to facilitate this interaction, many laboratories are exploring a variety of signal analysis techniques to improve the adaptation of the BCI system to the user. In the literature, many machine learning and pattern classification algorithms have been reported to give impressive results when applied to BCI data in offline analyses. However, it is more difficult to evaluate their relative value for actual online use. BCI data competitions have been organized to provide objective formal evaluations of alternative methods. Prompted by the great interest in the first two BCI Competitions, we organized the third BCI Competition to address several of the most difficult and important analysis problems in BCI research. The paper describes the data sets that were provided to the competitors and gives an overview of the results.

814 citations


Journal ArticleDOI
TL;DR: This study shows that the brain signals used for control can change substantially from the offline calibration sessions to online control, and also within a single session; it also proposes several adaptive classification schemes and studies their performance on data recorded during online experiments.
Abstract: Non-stationarities are ubiquitous in EEG signals. They are especially apparent in the use of EEG-based brain–computer interfaces (BCIs): (a) in the differences between the initial calibration measurement and the online operation of a BCI, or (b) caused by changes in the subject's brain processes during an experiment (e.g. due to fatigue, change of task involvement, etc). In this paper, we quantify for the first time such systematic evidence of statistical differences in data recorded during offline and online sessions. Furthermore, we propose novel techniques of investigating and visualizing data distributions, which are particularly useful for the analysis of (non-)stationarities. Our study shows that the brain signals used for control can change substantially from the offline calibration sessions to online control, and also within a single session. In addition to this general characterization of the signals, we propose several adaptive classification schemes and study their performance on data recorded during online experiments. An encouraging result of our study is that surprisingly simple adaptive methods in combination with an offline feature selection scheme can significantly increase BCI performance.
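
To make the flavor of such adaptation concrete, here is a minimal sketch of one simple adaptive scheme, unsupervised bias tracking for a linear classifier; the paper's actual schemes and feature selection differ, and all names below are illustrative.

```python
import numpy as np

def lda_train(X, y):
    """Binary LDA on calibration data: returns weight vector w and bias b."""
    X0, X1 = X[y == 0], X[y == 1]
    Sw = np.cov(X0, rowvar=False) + np.cov(X1, rowvar=False)
    w = np.linalg.solve(Sw, X1.mean(axis=0) - X0.mean(axis=0))
    b = -w @ (X0.mean(axis=0) + X1.mean(axis=0)) / 2.0
    return w, b

def adapt_bias(w, b, x_new, eta=0.05):
    """Unsupervised bias adaptation: nudge the threshold so the running mean
    of the classifier output stays centred, tracking slow signal drifts."""
    return b - eta * (w @ x_new + b)

# usage: classify each online trial, then update the bias
# w, b = lda_train(X_calibration, y_calibration)
# for x in online_trials:
#     label = np.sign(w @ x + b)
#     b = adapt_bias(w, b, x)
```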

436 citations


Journal ArticleDOI
TL;DR: A novel technique that allows the simultaneous optimization of a spatial and a spectral filter enhancing discriminability rates of multichannel EEG single-trials is presented.
Abstract: Brain-computer interface (BCI) systems create a novel communication channel from the brain to an output device by bypassing conventional motor output pathways of nerves and muscles. Therefore they could provide a new communication and control option for paralyzed patients. Modern BCI technology is essentially based on techniques for the classification of single-trial brain signals. Here we present a novel technique that allows the simultaneous optimization of a spatial and a spectral filter, enhancing discriminability rates of multichannel EEG single-trials. The evaluation of 60 experiments involving 22 different subjects demonstrates the significant superiority of the proposed algorithm over its classical counterpart: the median classification error rate was decreased by 11%. Apart from the enhanced classification, the spatial and/or the spectral filter that are determined by the algorithm can also be used for further analysis of the data, e.g., for source localization of the respective brain rhythms.
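
For reference, a compact sketch of the classical counterpart mentioned above, the Common Spatial Patterns (CSP) algorithm; the paper's contribution extends this by optimizing a spectral filter jointly with the spatial one. Function names are illustrative.

```python
import numpy as np
from scipy.linalg import eigh

def csp(X1, X2, n_filters=3):
    """CSP spatial filters from two classes of band-pass filtered EEG.
    X1, X2: arrays of trials with shape (trials, channels, samples)."""
    C1 = np.mean([x @ x.T / np.trace(x @ x.T) for x in X1], axis=0)
    C2 = np.mean([x @ x.T / np.trace(x @ x.T) for x in X2], axis=0)
    evals, evecs = eigh(C1, C1 + C2)   # generalized eigenproblem C1 w = l (C1+C2) w
    order = np.argsort(evals)          # extreme eigenvalues discriminate best
    picks = np.r_[order[:n_filters], order[-n_filters:]]
    return evecs[:, picks].T           # rows are spatial filters

def log_var_features(W, X):
    """Standard CSP features: log variance of spatially filtered trials."""
    return np.array([np.log(np.var(W @ x, axis=1)) for x in X])
```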

373 citations


Journal Article
TL;DR: A new design of storage and numerical operations is proposed, which speeds up the training of an incremental SVM by a factor of 5 to 20, and various applications of the algorithm can be foreseen.
Abstract: Incremental Support Vector Machines (SVM) are instrumental in practical applications of online learning. This work focuses on the design and analysis of efficient incremental SVM learning, with the aim of providing a fast, numerically stable and robust implementation. A detailed analysis of convergence and of algorithmic complexity of incremental SVM learning is carried out. Based on this analysis, a new design of storage and numerical operations is proposed, which speeds up the training of an incremental SVM by a factor of 5 to 20. The performance of the new algorithm is demonstrated in two scenarios: learning with limited resources and active learning. Various applications of the algorithm, such as in drug discovery, online monitoring of industrial devices and surveillance of network traffic, can be foreseen.
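
The paper's core is exact bookkeeping for adding and removing support vectors, which is beyond a short snippet. As a rough stand-in for the online setting it targets, a linear SVM objective can be trained one example at a time with stochastic gradient descent; this is an approximation, not the paper's algorithm.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

# Stand-in only: a linear SVM objective trained by SGD, updated per sample.
# The paper instead maintains *exact* SVM solutions under sample arrival/removal.
clf = SGDClassifier(loss="hinge", alpha=1e-4)
classes = np.array([-1, 1])

rng = np.random.default_rng(0)
for t in range(1000):                       # stream of examples
    x = rng.normal(size=(1, 10))
    y = np.array([1 if x[0, 0] > 0 else -1])
    clf.partial_fit(x, y, classes=classes)  # single online update
```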

369 citations


Journal ArticleDOI
19 Jun 2006
TL;DR: Experiments demonstrate that very high information transfer rates can be achieved using the readiness potential when predicting the laterality of upcoming left- versus right-hand movements in healthy subjects; these results are encouraging for an EEG-based BCI system in untrained subjects that is independent of peripheral nervous system activity.
Abstract: The Berlin Brain-Computer Interface (BBCI) project develops a noninvasive BCI system whose key features are 1) the use of well-established motor competences as control paradigms, 2) high-dimensional features from 128-channel electroencephalogram (EEG), and 3) advanced machine learning techniques. As reported earlier, our experiments demonstrate that very high information transfer rates can be achieved using the readiness potential (RP) when predicting the laterality of upcoming left- versus right-hand movements in healthy subjects. A more recent study showed that the RP similarly accompanies phantom movements in arm amputees, but the signal strength decreases with longer loss of the limb. In a complementary approach, oscillatory features are used to discriminate imagined movements (left hand versus right hand versus foot). In a recent feedback study with six healthy subjects with no or very little experience with BCI control, three subjects achieved an information transfer rate above 35 bits per minute (bpm), two further subjects achieved rates above 24 and 15 bpm, while one subject could not achieve any BCI control. These results are encouraging for an EEG-based BCI system in untrained subjects that is independent of peripheral nervous system activity and does not rely on evoked potentials, even when compared to results from very well-trained subjects operating other BCI systems.
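
Figures like "35 bits per minute" are conventionally computed with Wolpaw's information transfer rate formula; a small sketch, assuming this standard definition is the one used:

```python
import math

def wolpaw_itr(n_classes, accuracy, decisions_per_min):
    """Information transfer rate in bits/min (Wolpaw et al. formula),
    the standard measure behind figures like '35 bits per minute'."""
    p, n = accuracy, n_classes
    bits = math.log2(n)
    if 0 < p < 1:
        bits += p * math.log2(p) + (1 - p) * math.log2((1 - p) / (n - 1))
    return bits * decisions_per_min

# e.g. a binary BCI at 95% accuracy making 40 decisions/min:
# wolpaw_itr(2, 0.95, 40) -> ~28.5 bits/min
```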

335 citations


Journal ArticleDOI
19 Jun 2006
TL;DR: The issues discussed include a taxonomy of methods and applications, time-frequency spatial analysis, optimization schemes, the role of insight in analysis, adaptation, and methods for quantifying BCI feedback.
Abstract: This paper describes the outcome of discussions held during the Third International BCI Meeting at a workshop charged with reviewing and evaluating the current state of and issues relevant to brain-computer interface (BCI) feature extraction and translation. The issues discussed include a taxonomy of methods and applications, time-frequency spatial analysis, optimization schemes, the role of insight in analysis, adaptation, and methods for quantifying BCI feedback.

217 citations


Proceedings ArticleDOI
09 Jul 2006
TL;DR: A coding scheme utilizing an H.264/MPEG4-AVC codec handles the specific requirements of multi-view datasets, namely temporal and inter-view correlation; two main features of the coder are used: hierarchical B pictures for temporal dependencies and an adapted prediction scheme to exploit inter-view dependencies.
Abstract: Efficient multi-view coding requires coding algorithms that exploit temporal as well as inter-view dependencies between adjacent cameras. Based on a spatiotemporal analysis of the multi-view data set, we present a coding scheme utilizing an H.264/MPEG4-AVC codec. To handle the specific requirements of multi-view datasets, namely temporal and inter-view correlation, two main features of the coder are used: hierarchical B pictures for temporal dependencies and an adapted prediction scheme to exploit inter-view dependencies. Both features are set up in the H.264/MPEG4-AVC configuration file, such that coding and decoding is purely based on standardized software. Additionally, picture reordering before coding to optimize coding efficiency and inverse reordering after decoding to obtain individual views are applied. Finally, coding results are shown for the proposed multi-view coder and compared to simulcast anchor and simulcast hierarchical B picture coding.
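
The picture reordering for hierarchical B pictures follows a dyadic pattern: key pictures first, then B pictures level by level. A small illustrative helper (the actual coder configures this via the H.264/MPEG4-AVC configuration file):

```python
def hierarchical_b_order(gop_size=8):
    """Dyadic coding order for one GOP with hierarchical B pictures:
    key pictures first, then B pictures level by level.
    Returns display indices in coding order, e.g. [0, 8, 4, 2, 6, 1, 3, 5, 7]."""
    order = [0, gop_size]
    step = gop_size
    while step > 1:
        order += list(range(step // 2, gop_size, step))
        step //= 2
    return order
```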

173 citations


01 Jan 2006
TL;DR: A novel typewriter application 'Hex-o-Spell' is presented that is specifically tailored to the characteristics of direct brain-to-computer interaction; it was developed within the Berlin Brain-Computer Interface project in cooperation with specialists in Human-Computer Interaction.
Abstract: We present a novel typewriter application 'Hex-o-Spell' that is specifically tailored to the characteristics of direct brain-to-computer interaction. The high bandwidth at which a user may perceive information from the display is used in an appealing visualization based on hexagons. On the other hand, the control of the application is possible at low bandwidth, using only two control commands (mental states), and is relatively stable against delays and the like. The effectiveness and robustness of the interface was demonstrated at CeBIT 2006 (the world's largest IT fair), where two subjects operated the mental typewriter at a speed of up to 7.6 char/min. It was developed within the Berlin Brain-Computer Interface project in cooperation with specialists in Human-Computer Interaction.

137 citations


Journal ArticleDOI
TL;DR: Simple and fast methods based on nearest neighbors, which order objects from high-dimensional data sets from typical points to untypical points, allow outliers to be detected with a performance comparable to or better than other, often much more sophisticated, methods.
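
The common k-nearest-neighbor distance heuristic gives the flavor of such a ranking; the paper's exact scoring may differ, and names here are illustrative.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def knn_outlier_ranking(X, k=5):
    """Rank points from typical to untypical by mean distance
    to their k nearest neighbors (large distance = outlier)."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    dist, _ = nn.kneighbors(X)           # first column is the point itself
    score = dist[:, 1:].mean(axis=1)
    return np.argsort(score), score      # indices from most typical to least
```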

102 citations


Proceedings Article
04 Dec 2006
TL;DR: A novel framework for the classification of single trial ElectroEncephaloGraphy (EEG), based on regularized logistic regression, which compares favorably against conventional CSP based classifiers.
Abstract: We propose a novel framework for the classification of single trial ElectroEncephaloGraphy (EEG), based on regularized logistic regression. Framed in this robust statistical framework, no prior feature extraction or outlier removal is required. We present two variations of parameterizing the regression function: (a) with a full-rank symmetric matrix coefficient and (b) as a difference of two rank-1 matrices. In the first case, the problem is convex and the logistic regression is optimal under a generative model. The latter case is shown to be related to the Common Spatial Pattern (CSP) algorithm, which is a popular technique in Brain Computer Interfacing. The regression coefficients can also be topographically mapped onto the scalp similarly to CSP projections, which allows neuro-physiological interpretation. Simulations on 162 BCI datasets demonstrate that classification accuracy and robustness compare favorably against conventional CSP based classifiers.
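
Variant (a), a full-rank symmetric matrix coefficient applied to the trial covariance, amounts to logistic regression that is linear in the covariance entries. A minimal sketch under that reading, with plain L2 regularization standing in for the paper's regularizer:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def covariance_features(X):
    """Vectorize per-trial channel covariance matrices.
    X: (trials, channels, samples) band-pass filtered EEG."""
    feats = []
    for x in X:
        C = x @ x.T / x.shape[1]
        feats.append(C[np.triu_indices_from(C)])  # upper triangle suffices (symmetry)
    return np.array(feats)

# variant (a) of the paper, up to regularizer details: a logistic model that is
# linear in the trial covariance, i.e. P(y|X) = sigmoid(<W, Cov(X)> + b)
# clf = LogisticRegression(C=1.0).fit(covariance_features(X_train), y_train)
```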

Journal ArticleDOI
TL;DR: This article proposes a new linear method to identify the "non-Gaussian subspace" within a very general semi-parametric framework based on a linear operator which, to any arbitrary nonlinear (smooth) function, associates a vector belonging to the low dimensional non- Gaussian target subspace, up to an estimation error.
Abstract: Finding non-Gaussian components of high-dimensional data is an important preprocessing step for efficient information processing. This article proposes a new linear method to identify the "non-Gaussian subspace" within a very general semi-parametric framework. Our proposed method, called NGCA (non-Gaussian component analysis), is based on a linear operator which, to any arbitrary nonlinear (smooth) function, associates a vector belonging to the low dimensional non-Gaussian target subspace, up to an estimation error. By applying this operator to a family of different nonlinear functions, one obtains a family of different vectors lying in a vicinity of the target space. As a final step, the target space itself is estimated by applying PCA to this family of vectors. We show that this procedure is consistent in the sense that the estimation error tends to zero at a parametric rate, uniformly over the family. Numerical examples demonstrate the usefulness of our method.
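
The published NGCA operator for whitened data maps a smooth function h to beta(h) = E[x h(x)] - E[grad h(x)]; a condensed sketch with a random family of tanh ridge functions (the function family and all sizes here are arbitrary choices):

```python
import numpy as np

def ngca(X, n_components=2, n_functions=200, seed=0):
    """NGCA sketch: for whitened data x, beta(h) = E[x h(x)] - E[grad h(x)]
    lies (up to estimation error) in the non-Gaussian subspace;
    PCA over many such betas recovers a basis of it."""
    rng = np.random.default_rng(seed)
    X = X - X.mean(axis=0)                     # center, then whiten via SVD
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    Z = X @ Vt.T / s * np.sqrt(len(X))
    d = Z.shape[1]
    betas = []
    for _ in range(n_functions):
        a = rng.normal(size=d)
        a /= np.linalg.norm(a)
        h = np.tanh(Z @ a)                     # h(x) = tanh(a^T x)
        grad = (1 - h**2)[:, None] * a         # grad h(x) = (1 - h^2) a
        betas.append((Z * h[:, None]).mean(axis=0) - grad.mean(axis=0))
    _, _, Vt2 = np.linalg.svd(np.array(betas), full_matrices=False)
    return Vt2[:n_components]                  # basis in whitened coordinates
```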

Journal ArticleDOI
TL;DR: A method is developed that uses prior information about the phase-locking property of event-related potentials in a regularization framework to bias a blind source separation algorithm toward an improved separation of single-trial phase-locked responses in terms of an increased signal-to-noise ratio.
Abstract: When decomposing single trial electroencephalography it is a challenge to incorporate prior physiological knowledge. Here, we develop a method that uses prior information about the phase-locking property of event-related potentials in a regularization framework to bias a blind source separation algorithm toward an improved separation of single-trial phase-locked responses in terms of an increased signal-to-noise ratio. In particular, we suggest a transformation of the data, using a weighted average of the single trial and the trial-averaged response, that redirects the focus of source separation methods onto the subspace of event-related potentials. The practical benefit with respect to an improved separation of such components from ongoing background activity and extraneous noise is first illustrated on artificial data and finally verified in a real-world application of extracting single-trial somatosensory evoked potentials from multichannel EEG-recordings.
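
The suggested data transformation itself is simple; a sketch of mixing each single trial with the trial average (the weighting and the surrounding regularized BSS in the paper are more involved):

```python
import numpy as np

def phase_locked_emphasis(X, lam=0.5):
    """Transformation suggested in the abstract: mix each single trial with
    the trial-averaged response to emphasize phase-locked (event-related)
    components before source separation.
    X: (trials, channels, samples); lam: weight of the trial average."""
    erp = X.mean(axis=0, keepdims=True)        # trial-averaged response
    return (1.0 - lam) * X + lam * erp         # weighted single trials
```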

Proceedings Article
04 Dec 2006
TL;DR: A new paradigm is proposed that allows such calibration to be omitted entirely by transferring knowledge from prior sessions; a classifier constructed from individualized prototypes can be successfully transferred to a new session for a number of subjects.
Abstract: Up to now, even subjects that are experts in the use of machine learning based BCI systems still have to undergo a calibration session of about 20-30 min, from which their (movement) intentions are inferred. We now propose a new paradigm that allows us to omit such calibration entirely and instead transfer knowledge from prior sessions. To achieve this goal, we first define normalized CSP features and distances in-between. Second, we derive prototypical features across sessions: (a) by clustering or (b) by feature concatenation methods. Finally, we construct a classifier based on these individualized prototypes and show that, indeed, classifiers can be successfully transferred to a new session for a number of subjects.

Journal ArticleDOI
TL;DR: This work shows, by systematic modeling of non-Euclidean pairwise data, that there exist metric violations which can carry valuable problem-specific information, and that Euclidean and non-metric data can be unified on the level of the structural information contained in the data.

Journal ArticleDOI
TL;DR: A blind source separation technique is proposed that diagonalizes antisymmetrized cross-correlation or cross-spectral matrices and the resulting decomposition finds truly interacting subsystems blindly and suppresses any spurious interaction stemming from the mixture.
Abstract: We present a technique that identifies truly interacting subsystems of a complex system from multichannel data if the recordings are an unknown linear and instantaneous mixture of the true sources. The method is valid for arbitrary noise structure. For this, a blind source separation technique is proposed that diagonalizes antisymmetrized cross-correlation or cross-spectral matrices. The resulting decomposition finds truly interacting subsystems blindly and suppresses any spurious interaction stemming from the mixture. The usefulness of this interacting source analysis is demonstrated in simulations and for real electroencephalography data.
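
A sketch of the central quantity, the antisymmetrized time-lagged covariance, which vanishes for non-interacting mixed sources; the paper jointly diagonalizes many such matrices, while this simplification inspects a single lag:

```python
import numpy as np

def antisymmetrized_cross_cov(X, tau):
    """Antisymmetrized time-lagged covariance A = C(tau) - C(tau)^T for
    multichannel data X of shape (channels, samples). For an instantaneous
    mixture of non-interacting sources, A vanishes; a nonzero A reflects
    genuine interactions."""
    X = X - X.mean(axis=1, keepdims=True)
    C = X[:, :-tau] @ X[:, tau:].T / (X.shape[1] - tau)
    return C - C.T

# single-lag simplification: the leading singular vectors of A span the
# strongest interacting pair (the paper jointly diagonalizes many lags)
# U, s, Vt = np.linalg.svd(antisymmetrized_cross_cov(X, tau=5))
```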

Patent
08 Dec 2006
TL;DR: In this paper, the authors present a method and an apparatus for automatic comparison of at least two data sequences characterized in: an evaluation of a local relationship between any pair of subsequences in two or more sequences; and a global relationship by means of aggregation of the evaluations of said local relationships.
Abstract: The invention is concerned with a method and an apparatus for automatic comparison of at least two data sequences characterized in: an evaluation of a local relationship between any pair of subsequences in two or more sequences; an evaluation of a global relationship by means of aggregation of the evaluations of said local relationships.

01 Jan 2006
TL;DR: In this paper, the incorporation of prior knowledge about joint angle configurations into 3-D human pose tracking is considered, using a nonparametric Parzen density estimation in the 12-dimensional joint configuration space.
Abstract: The present paper considers the incorporation of prior knowledge about joint angle configurations into 3-D human pose tracking. Training samples obtained from an industrial marker-based tracking system are used for a nonparametric Parzen density estimation in the 12-dimensional joint configuration space. These learned probability densities constrain the image-driven joint angle estimates by drawing solutions towards familiar configurations. This prevents the method from producing unrealistic pose estimates due to unreliable image cues. Experiments on sequences with a human leg model reveal a considerably increased robustness, particularly in the presence of disturbed images and occlusions.
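
A minimal sketch of such a Parzen prior over joint angles, a mean of isotropic Gaussians centred on the training configurations (the kernel bandwidth here is an arbitrary assumption):

```python
import numpy as np

def parzen_log_density(theta, samples, sigma=0.1):
    """Nonparametric Parzen estimate of the joint-angle prior:
    a mean of isotropic Gaussians centred on training configurations.
    theta: (d,) query pose; samples: (n, d) training joint angles."""
    d = samples.shape[1]
    sq = ((samples - theta) ** 2).sum(axis=1)
    log_k = -sq / (2 * sigma**2) - 0.5 * d * np.log(2 * np.pi * sigma**2)
    m = log_k.max()                            # log-mean-exp for stability
    return m + np.log(np.exp(log_k - m).mean())
```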

Book ChapterDOI
12 Sep 2006
TL;DR: A novel spectral filter optimization algorithm is proposed for the single trial ElectroEncephaloGraphy (EEG) classification problem, showing how prior knowledge can either drastically improve the classification or be misleading.
Abstract: We propose a novel spectral filter optimization algorithm for the single trial ElectroEncephaloGraphy (EEG) classification problem. The algorithm is designed to improve the classification accuracy of Common Spatial Pattern (CSP) based classifiers. It is based on a simple statistical criterion, and allows the user to incorporate any prior information one has about the spectrum of the signal. We show how, depending on the preprocessing, prior knowledge can either drastically improve the classification or be misleading. We also show a generalization of the CSP algorithm so that the CSP spatial projection can be recalculated after the optimization of the spectral filter. This leads to an iterative procedure of spectral and spatial filter updates that further improves the classification accuracy, not only by imposing a spectral filter but also by choosing a better spatial projection.

Book ChapterDOI
12 Sep 2006
TL;DR: This paper proposes a new method called importance-weighted cross-validation, which remains unbiased even under the covariate shift; its usefulness is successfully tested on toy data and demonstrated in the brain-computer interface, where strong non-stationarity effects can be seen between calibration and feedback sessions.
Abstract: A common assumption in supervised learning is that the input points in the training set follow the same probability distribution as the input points used for testing. However, this assumption is not satisfied, for example, when predictions must extrapolate outside the training region. The situation where the training input points and test input points follow different distributions is called the covariate shift. Under the covariate shift, standard machine learning techniques such as empirical risk minimization or cross-validation do not work well, since their unbiasedness is no longer maintained. In this paper, we propose a new method called importance-weighted cross-validation, which is still unbiased even under the covariate shift. The usefulness of our proposed method is successfully tested on toy data and furthermore demonstrated in the brain-computer interface, where strong non-stationarity effects can be seen between calibration and feedback sessions.
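
A sketch of the importance-weighting idea: validation errors are reweighted by the density ratio w(x) = p_test(x)/p_train(x), which restores unbiasedness under covariate shift. The weights are assumed given (e.g., from density-ratio estimation), and model handling is simplified:

```python
import numpy as np

def iwcv_error(models, X, y, weights, k=5):
    """Importance-weighted cross-validation sketch: fold errors are averaged
    with weights w(x) = p_test(x) / p_train(x), keeping the CV estimate
    unbiased under covariate shift (weights assumed given)."""
    idx = np.arange(len(X))
    scores = []
    for model in models:
        errs = []
        for f in range(k):
            val = idx % k == f
            m = model.fit(X[~val], y[~val])
            e = (m.predict(X[val]) != y[val]).astype(float)
            errs.append(np.average(e, weights=weights[val]))
        scores.append(np.mean(errs))
    return scores  # pick the model with the smallest weighted CV error
```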

Book ChapterDOI
12 Sep 2006
TL;DR: This contribution addresses the efficient computation of distance functions and similarity coefficients for sequential data; two proposed algorithms utilize different data structures and yield a runtime linear in the sequence length.
Abstract: Kernel functions as similarity measures for sequential data have been extensively studied in previous research. This contribution addresses the efficient computation of distance functions and similarity coefficients for sequential data. Two proposed algorithms utilize different data structures for efficient computation and yield a runtime linear in the sequence length. Experiments on network data for intrusion detection suggest the importance of distances and even non-metric similarity measures for sequential data.
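
A hash map over k-grams already gives a linear-time instance of such comparison measures; the paper studies dedicated data structures, so this sketch only illustrates the linear-runtime idea:

```python
from collections import Counter

def kgram_manhattan(s, t, k=4):
    """Linear-time Manhattan distance between k-gram histograms of two
    sequences (hash map in place of the paper's data structures)."""
    cs = Counter(s[i:i + k] for i in range(len(s) - k + 1))
    ct = Counter(t[i:i + k] for i in range(len(t) - k + 1))
    return sum(abs(cs[g] - ct[g]) for g in cs.keys() | ct.keys())

# e.g. comparing byte streams from network connections:
# kgram_manhattan(b"GET /index.html", b"GET /admin.php", k=3)
```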

Proceedings ArticleDOI
01 Oct 2006
TL;DR: An RD-optimized mesh coder is presented that includes different prediction modes as well as an RD cost computation controlling the mode selection across all possible spatial partitions of a mesh, to find the clustering structure together with the associated prediction modes.
Abstract: Recent developments in the compression of dynamic meshes or mesh sequences have shown that the statistical dependencies within a mesh sequence can be exploited well by predictive coding approaches. Coders introduced so far use experimentally determined or heuristic thresholds for tuning the algorithms. In video coding rate-distortion (RD) optimization is often used to avoid fixing of thresholds and to select a coding mode. We applied these ideas and present here an RD-optimized mesh coder. It includes different prediction modes as well as an RD cost computation that controls the mode selection across all possible spatial partitions of a mesh to find the clustering structure together with the associated prediction modes. The structure of the RD-optimized D3DMC coder is presented, followed by comparative results with mesh sequences at different resolutions.
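
The Lagrangian mode decision borrowed from video coding is compact enough to sketch: each candidate mode is scored by J = D + lambda*R and the minimizer is chosen. Mode names and numbers below are hypothetical:

```python
def select_mode(modes, lagrange_lambda):
    """Lagrangian mode decision as used in RD-optimized video/mesh coders:
    minimize J = D + lambda * R over candidate modes.
    modes: list of (name, distortion, rate_bits) candidates."""
    return min(modes, key=lambda m: m[1] + lagrange_lambda * m[2])

# e.g. choosing among hypothetical prediction modes for one mesh cluster:
# select_mode([("static", 12.0, 40), ("delta", 5.5, 95), ("affine", 2.1, 240)], 0.03)
```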

Proceedings Article
04 Dec 2006
TL;DR: It is shown that the relevant information about a classification problem in feature space is contained up to negligible error in a finite number of leading kernel PCA components if the kernel matches the underlying learning problem.
Abstract: We show that the relevant information about a classification problem in feature space is contained up to negligible error in a finite number of leading kernel PCA components if the kernel matches the underlying learning problem. Thus, kernels not only transform data sets such that good generalization can be achieved even by linear discriminant functions, but this transformation is also performed in a manner which makes economic use of feature space dimensions. In the best case, kernels provide efficient implicit representations of the data to perform classification. Practically, we propose an algorithm which enables us to recover the subspace and dimensionality relevant for good classification. Our algorithm can therefore be applied (1) to analyze the interplay of data set and kernel in a geometric fashion, (2) to help in model selection, and to (3) de-noise in feature space in order to yield better classification results.
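
A sketch of the kernel PCA projection underlying such an analysis; the relevant dimensionality can then be probed by training a linear classifier on a growing number of leading components and watching the error saturate:

```python
import numpy as np

def kernel_pca(K, n_components):
    """Project onto leading kernel PCA components given a kernel matrix K."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n          # centering
    Kc = H @ K @ H
    evals, evecs = np.linalg.eigh(Kc)
    idx = np.argsort(evals)[::-1][:n_components]
    # training-point scores: eigenvectors scaled by sqrt(eigenvalue)
    return evecs[:, idx] * np.sqrt(np.maximum(evals[idx], 1e-12))

# relevant-dimensionality probe in the spirit of the paper: fit a linear
# classifier on the first d components for growing d; error saturates at
# the dimensionality that carries the relevant information
```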

Proceedings Article
04 Dec 2006
TL;DR: There may not be a strict dichotomy between a metric and a non-metric internal space, but rather degrees to which potentially large subsets of stimuli are represented metrically, with a small subset causing a global violation of metricity.
Abstract: Attempting to model human categorization and similarity judgements is both a very interesting and an exceedingly difficult challenge. Some of the difficulty arises because of conflicting evidence on whether human categorization and similarity judgements should or should not be modelled as operating on a mental representation that is essentially metric. Intuitively, this has a strong appeal, as it would allow (dis)similarity to be represented geometrically as distance in some internal space. Here we show how a single stimulus, carefully constructed in a psychophysical experiment, introduces ℓ2 violations in what used to be an internal similarity space that could be adequately modelled as Euclidean. We term this one influential data point a conflictual judgement. We present an algorithm for analysing such data and identifying the crucial point. Thus there may not be a strict dichotomy between either a metric or a non-metric internal space, but rather degrees to which potentially large subsets of stimuli are represented metrically, with a small subset causing a global violation of metricity.
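
A crude way to localize such metricity-breaking objects is to count, per object, the triangle inequalities a dissimilarity matrix violates; the paper's algorithm is more refined, and this O(n^3) loop is purely illustrative:

```python
import numpy as np

def triangle_violations(D, tol=1e-9):
    """Count triangle-inequality violations in a dissimilarity matrix D and
    attribute them to individual objects -- a crude way to spot the kind of
    single 'conflictual' stimulus the paper analyzes."""
    n = D.shape[0]
    per_object = np.zeros(n, dtype=int)
    for i in range(n):
        for j in range(n):
            for k in range(n):
                if D[i, j] > D[i, k] + D[k, j] + tol:
                    per_object[[i, j, k]] += 1
    return per_object  # large counts single out metricity-breaking objects
```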

Book ChapterDOI
01 Jan 2006
TL;DR: A particular focus is placed on linear classification methods which can be applied in the BCI context, and an overview of the Berlin Brain-Computer Interface (BBCI) is provided.
Abstract: This paper discusses machine learning methods and their application to Brain-Computer Interfacing. A particular focus is placed on linear classification methods which can be applied in the BCI context. Finally, we provide an overview of the Berlin Brain-Computer Interface (BBCI).

Book ChapterDOI
10 Sep 2006
TL;DR: An extension of the Singular Information Criterion is proposed that allows it to be applied to many singular machines; its efficiency is evaluated in Gaussian mixtures, offering an effective strategy to select the optimal size.
Abstract: Deciding the optimal size of learning machines is a central issue in statistical learning theory, which is why theoretical criteria such as the BIC have been developed. However, they cannot be applied to singular machines, and it is known that many practical learning machines, e.g. mixture models, hidden Markov models, and Bayesian networks, are singular. Recently, we proposed the Singular Information Criterion (SingIC), which allows us to select the optimal size of singular machines. The SingIC is based on an analysis of the learning coefficient, so the machines to which it can be applied are still limited. In this paper, we propose an extension of this criterion, which enables us to apply it to many singular machines, and evaluate its efficiency in Gaussian mixtures. The results offer an effective strategy to select the optimal size.

Proceedings ArticleDOI
14 May 2006
TL;DR: A method is given for obtaining the BLUE without prior knowledge of the subspace to which the true signal belongs or of the noise covariance matrix; the additional assumption is that the true signal follows a non-Gaussian distribution while the noise is Gaussian.
Abstract: Obtaining the best linear unbiased estimator (BLUE) of noisy signals is a traditional but powerful approach to noise reduction. Explicitly computing the BLUE usually requires prior knowledge of the subspace to which the true signal belongs and of the noise covariance matrix. However, such prior knowledge is often unavailable in reality, which prevents us from applying the BLUE to real-world problems. In this paper, we therefore give a method for obtaining the BLUE without such prior knowledge. Our additional assumption is that the true signal follows a non-Gaussian distribution while the noise is Gaussian.
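
For context, the classical BLUE that the paper avoids computing explicitly: given the signal-subspace basis H and noise covariance Sigma for observations y = H*theta + noise, the estimate is H (H^T Sigma^-1 H)^-1 H^T Sigma^-1 y. A direct sketch:

```python
import numpy as np

def blue(y, H, Sigma):
    """Classical BLUE of a signal s = H @ theta observed as y = s + noise,
    given the signal-subspace basis H and the noise covariance Sigma --
    exactly the prior knowledge the paper shows how to avoid."""
    Si = np.linalg.inv(Sigma)
    theta = np.linalg.solve(H.T @ Si @ H, H.T @ Si @ y)
    return H @ theta   # de-noised signal estimate
```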

Book
01 Oct 2006
Pattern Recognition: 28th DAGM Symposium, Berlin, Germany, September 12-14, 2006, Proceedings.
