Reference Entry · DOI

Independent Component Analysis

31 Aug 2012
TL;DR: The basic independent component model, in which a p-variate observation is a linear mixture of p independent latent variables, is reviewed; two families of ICA estimates are considered, and their statistical properties are analyzed and compared via gain matrices.
Abstract: Independent component models have gained increasing interest in various fields of applications in recent years. The basic independent component model is a semiparametric model assuming that a p-variate observed random vector is a linear transformation of an unobserved vector of p independent latent variables. This linear transformation is given by an unknown mixing matrix, and one of the main objectives of independent component analysis (ICA) is to estimate an unmixing matrix by means of which the latent variables can be recovered. In this article, we discuss the basic independent component model in detail, define the concepts and analysis tools carefully, and consider two families of ICA estimates. The statistical properties (consistency, asymptotic normality, efficiency, robustness) of the estimates can be analyzed and compared via the so-called gain matrices. Some extensions of the basic independent component model, such as models with additive noise or models with dependent observations, are briefly discussed. The article ends with a short example. Keywords: blind source separation; fastICA; independent component model; independent subspace analysis; mixing matrix; overcomplete ICA; undercomplete ICA; unmixing matrix
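
As a concrete illustration of the basic model (not part of the original entry), the short sketch below mixes two hand-picked independent sources through an arbitrary mixing matrix and recovers them with scikit-learn's FastICA; all signal shapes and numerical values are assumptions made for the example.

# Minimal sketch of the basic ICA model x = A s (all values are illustrative).
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
t = np.linspace(0, 8, 2000)

# Two independent, non-Gaussian latent sources (p = 2).
s = np.column_stack([np.sign(np.sin(3 * t)),      # square-wave-like source
                     rng.laplace(size=t.size)])   # heavy-tailed source

A = np.array([[1.0, 0.5],                         # "unknown" mixing matrix
              [0.3, 2.0]])
x = s @ A.T                                       # observed p-variate vectors, one per row

ica = FastICA(n_components=2, random_state=0)
s_hat = ica.fit_transform(x)                      # recovered latent variables
W = ica.components_                               # estimated unmixing matrix

As in any ICA fit, the recovered components are identified only up to permutation, sign, and scale.
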
Citations
Journal Article · DOI
TL;DR: The basic theory and applications of ICA are presented; the goal is to find a linear representation of non-Gaussian data whose components are statistically independent, or as independent as possible.

8,231 citations


Cites background from "Independent Component Analysis"

  • ...Independent Component Analysis (ICA) (see Hyvärinen, Karhunen, and Oja (2001) and Cichocki and Amari (2002)) is a novel statistical signal and data analysis method....


Book
01 Jan 2009
TL;DR: The motivations and principles of learning algorithms for deep architectures are discussed, in particular those that use unsupervised learning of single-layer models, such as Restricted Boltzmann Machines, as building blocks for constructing deeper models such as Deep Belief Networks.
Abstract: Can machine learning deliver AI? Theoretical results, inspiration from the brain and cognition, as well as machine learning experiments suggest that in order to learn the kind of complicated functions that can represent high-level abstractions (e.g. in vision, language, and other AI-level tasks), one would need deep architectures. Deep architectures are composed of multiple levels of non-linear operations, such as in neural nets with many hidden layers, graphical models with many levels of latent variables, or in complicated propositional formulae re-using many sub-formulae. Each level of the architecture represents features at a different level of abstraction, defined as a composition of lower-level features. Searching the parameter space of deep architectures is a difficult task, but new algorithms have been discovered and a new sub-area has emerged in the machine learning community since 2006, following these discoveries. Learning algorithms such as those for Deep Belief Networks and other related unsupervised learning algorithms have recently been proposed to train deep architectures, yielding exciting results and beating the state-of-the-art in certain areas. Learning Deep Architectures for AI discusses the motivations for and principles of learning algorithms for deep architectures. By analyzing and comparing recent results with different learning algorithms for deep architectures, explanations for their success are proposed and discussed, highlighting challenges and suggesting avenues for future explorations in this area.
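
To make the phrase "multiple levels of non-linear operations" concrete, here is a minimal generic sketch (not code from the book); the layer sizes, random weights, and tanh non-linearity are arbitrary choices for illustration.

# Minimal sketch: a deep architecture as a composition of non-linear layers.
# Sizes, weights, and the non-linearity are illustrative choices only.
import numpy as np

rng = np.random.default_rng(0)
layer_sizes = [64, 32, 16, 8]          # input dimension followed by three hidden levels

def random_layers(sizes):
    """One randomly initialised weight matrix per level of the architecture."""
    return [rng.normal(scale=0.1, size=(m, n))
            for n, m in zip(sizes[:-1], sizes[1:])]

def forward(x, layers):
    """Each level re-represents its input as non-linear features of the level below."""
    h = x
    for W in layers:
        h = np.tanh(W @ h)
    return h

x = rng.normal(size=layer_sizes[0])
top_level_features = forward(x, random_layers(layer_sizes))
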

7,767 citations

Journal Article · DOI
TL;DR: The objective of this review paper is to summarize and compare some of the well-known methods used in various stages of a pattern recognition system and identify research topics and applications which are at the forefront of this exciting and challenging field.
Abstract: The primary goal of pattern recognition is supervised or unsupervised classification. Among the various frameworks in which pattern recognition has been traditionally formulated, the statistical approach has been most intensively studied and used in practice. More recently, neural network techniques and methods imported from statistical learning theory have been receiving increasing attention. The design of a recognition system requires careful attention to the following issues: definition of pattern classes, sensing environment, pattern representation, feature extraction and selection, cluster analysis, classifier design and learning, selection of training and test samples, and performance evaluation. In spite of almost 50 years of research and development in this field, the general problem of recognizing complex patterns with arbitrary orientation, location, and scale remains unsolved. New and emerging applications, such as data mining, web searching, retrieval of multimedia data, face recognition, and cursive handwriting recognition, require robust and efficient pattern recognition techniques. The objective of this review paper is to summarize and compare some of the well-known methods used in various stages of a pattern recognition system and identify research topics and applications which are at the forefront of this exciting and challenging field.
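
As a hedged illustration of the stages the review lists (pattern representation, feature extraction, classifier design and learning, selection of training and test samples, performance evaluation), the sketch below chains them with scikit-learn on a toy digits dataset; the particular estimators are arbitrary choices for the example, not recommendations from the paper.

# Minimal sketch of a pattern recognition pipeline: representation -> feature
# extraction -> classifier -> evaluation (estimator choices are illustrative only).
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_digits(return_X_y=True)                 # pattern representation
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)           # training / test samples

clf = make_pipeline(
    StandardScaler(),                               # normalisation of the raw representation
    PCA(n_components=30),                           # feature extraction and selection
    LogisticRegression(max_iter=1000),              # classifier design and learning
)
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))  # performance evaluation
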

6,527 citations

Journal Article · DOI
TL;DR: An integrated approach to probabilistic independent component analysis for functional MRI (FMRI) data that allows for nonsquare mixing in the presence of Gaussian noise is presented, and its spatio-temporal accuracy is compared with that of classical ICA and GLM analyses.
Abstract: We present an integrated approach to probabilistic independent component analysis (ICA) for functional MRI (FMRI) data that allows for nonsquare mixing in the presence of Gaussian noise. In order to avoid overfitting, we employ objective estimation of the amount of Gaussian noise through Bayesian analysis of the true dimensionality of the data, i.e., the number of activation and non-Gaussian noise sources. This enables us to carry out probabilistic modeling and achieves an asymptotically unique decomposition of the data. It reduces problems of interpretation, as each final independent component is now much more likely to be due to only one physical or physiological process. We also describe other improvements to standard ICA, such as temporal prewhitening and variance normalization of time series, the latter being particularly useful in the context of dimensionality reduction when weak activation is present. We discuss the use of prior information about the spatio-temporal nature of the source processes, and an alternative-hypothesis testing approach for inference, using Gaussian mixture models. The performance of our approach is illustrated and evaluated on real and artificial FMRI data, and compared, in terms of spatio-temporal accuracy, with results obtained from classical ICA and GLM analyses.
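
The sketch below is not the authors' implementation; it only mimics the two-stage idea under stated assumptions: a few non-Gaussian sources are mixed non-squarely into many channels with additive Gaussian noise, the channels are variance-normalized, PCA stands in for the Bayesian estimate of the true dimensionality, and ICA is then run within the reduced signal-plus-noise subspace.

# Hedged sketch of non-square mixing with Gaussian noise: reduce to an estimated
# signal subspace, then run ICA there (NOT the probabilistic ICA implementation).
import numpy as np
from sklearn.decomposition import PCA, FastICA

rng = np.random.default_rng(1)
n_samples, n_channels, n_sources = 5000, 20, 3

S = rng.laplace(size=(n_samples, n_sources))       # non-Gaussian sources
A = rng.normal(size=(n_channels, n_sources))       # non-square mixing matrix
X = S @ A.T + 0.2 * rng.normal(size=(n_samples, n_channels))  # plus isotropic noise

X = (X - X.mean(0)) / X.std(0)                     # variance-normalise each channel

pca = PCA(n_components=n_sources)                  # stand-in for the Bayesian
Z = pca.fit_transform(X)                           # dimensionality estimate

S_hat = FastICA(random_state=0).fit_transform(Z)   # ICA within the reduced subspace
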

2,597 citations


Cites background or methods from "Independent Component Analysis"

  • ...In order to optimise for non-Gaussian source estimates, [23] propose the following contrast function:...


  • ...At the second stage the source signals are estimated within the lower-dimensional signal + noise subspace using a fixed-point iteration scheme [23] that maximises the non-Gaussianity of the source estimates....


  • ...A proof of convergence and discussion about the choice of the non-linear function can be found in [23]....


  • ...Earlier work [41] characterised the multivariate normal distribution through the non-uniqueness of its linear structure, a result which within the ICA literature has been restated as the limitation that only one Gaussian source process, at most, may contribute to the observations for the ICA model to be estimable [15, 23]....


  • ...[23] have presented an elegant fixed-point algorithm that uses approximations to negentropy in order to optimise for non-Gaussian source distributions, and give a clear account of the relation of this approach to statistical independence....


References
Journal Article · DOI
TL;DR: An efficient algorithm is proposed, which allows the computation of the ICA of a data matrix in polynomial time and may be seen as an extension of principal component analysis (PCA).

8,522 citations

Journal Article · DOI
TL;DR: Using maximum entropy approximations of differential entropy, a family of new contrast (objective) functions for ICA is introduced, enabling both estimation of the whole decomposition by minimizing mutual information and estimation of individual independent components as projection pursuit directions.
Abstract: Independent component analysis (ICA) is a statistical method for transforming an observed multidimensional random vector into components that are statistically as independent from each other as possible. We use a combination of two different approaches for linear ICA: Comon's information theoretic approach and the projection pursuit approach. Using maximum entropy approximations of differential entropy, we introduce a family of new contrast functions for ICA. These contrast functions enable both the estimation of the whole decomposition by minimizing mutual information, and estimation of individual independent components as projection pursuit directions. The statistical properties of the estimators based on such contrast functions are analyzed under the assumption of the linear mixture model, and it is shown how to choose contrast functions that are robust and/or of minimum variance. Finally, we introduce simple fixed-point algorithms for practical optimization of the contrast functions.
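
For orientation, the maximum-entropy approximation of negentropy behind such contrast functions can be written in standard notation (a sketch, not quoted from the paper) as

\[ J(y) \;\approx\; c\,\bigl[\mathrm{E}\{G(y)\} - \mathrm{E}\{G(\nu)\}\bigr]^{2}, \qquad \nu \sim \mathcal{N}(0,1), \]

where \(y = \mathbf{w}^{\mathsf T}\mathbf{x}\) is a zero-mean, unit-variance projection of the whitened data and \(c > 0\) is a constant; commonly used choices include \(G(u) = \tfrac{1}{a}\log\cosh(au)\) with \(1 \le a \le 2\) and \(G(u) = -\exp(-u^{2}/2)\). Maximizing \(J\) over unit-norm \(\mathbf{w}\) yields one independent component, and mutually decorrelated maxima give the full decomposition.
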

6,144 citations

Journal Article · DOI
TL;DR: A novel fast algorithm for independent component analysis is introduced, which can be used for blind source separation and feature extraction, and the convergence speed is shown to be cubic.
Abstract: We introduce a novel fast algorithm for independent component analysis, which can be used for blind source separation and feature extraction. We show how a neural network learning rule can be transformed into a fixed-point iteration, which provides an algorithm that is very simple, does not depend on any user-defined parameters, and is fast to converge to the most accurate solution allowed by the data. The algorithm finds, one at a time, all non-Gaussian independent components, regardless of their probability distributions. The computations can be performed in either batch mode or a semiadaptive manner. The convergence of the algorithm is rigorously proved, and the convergence speed is shown to be cubic. Some comparisons to gradient-based algorithms are made, showing that the new algorithm is usually 10 to 100 times faster, sometimes giving the solution in just a few iterations.
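
A minimal one-unit version of such a fixed-point iteration, written as a generic sketch rather than the authors' original algorithm, could look as follows; it assumes the data matrix has already been centred and whitened, and uses tanh as the non-linearity.

# One-unit FastICA-style fixed-point iteration (illustrative sketch).
# Assumes X has shape (n_features, n_samples) and is already centred and whitened.
import numpy as np

def one_unit_fixed_point(X, n_iter=200, tol=1e-8, seed=0):
    rng = np.random.default_rng(seed)
    w = rng.normal(size=X.shape[0])
    w /= np.linalg.norm(w)
    for _ in range(n_iter):
        wx = w @ X                                         # projections w^T x
        g, g_prime = np.tanh(wx), 1.0 - np.tanh(wx) ** 2
        w_new = (X * g).mean(axis=1) - g_prime.mean() * w  # fixed-point update
        w_new /= np.linalg.norm(w_new)                     # back to the unit sphere
        if abs(abs(w_new @ w) - 1.0) < tol:                # converged up to sign
            return w_new
        w = w_new
    return w

Running the routine repeatedly with deflation (orthogonalizing each new w against the directions already found) recovers the remaining components one at a time.
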

3,215 citations

Journal Article · DOI
01 Dec 1993
TL;DR: In this paper, a computationally efficient technique for blind estimation of directional vectors, based on joint diagonalization of fourth-order cumulant matrices, is presented for beamforming.
Abstract: The paper considers an application of blind identification to beamforming. The key point is to use estimates of directional vectors rather than resort to their hypothesised value. By using estimates of the directional vectors obtained via blind identification, i.e. without knowing the array manifold, beamforming is made robust with respect to array deformations, distortion of the wave front, pointing errors etc., so that neither array calibration nor physical modelling is necessary. Rather surprisingly, ‘blind beamformers’ may outperform ‘informed beamformers’ in a plausible range of parameters, even when the array is perfectly known to the informed beamformer. The key assumption on which blind identification relies is the statistical independence of the sources, which is exploited using fourth-order cumulants. A computationally efficient technique is presented for the blind estimation of directional vectors, based on joint diagonalisation of fourth-order cumulant matrices; its implementation is described, and its performance is investigated by numerical experiments.
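
In standard notation (a sketch of the idea, not the paper's exact formulation), the fourth-order cumulant matrices that are jointly diagonalised can be written as

\[ \bigl[Q_{\mathbf z}(M)\bigr]_{ij} \;=\; \sum_{k,l} \operatorname{cum}(z_i, z_j, z_k, z_l)\, M_{kl}, \]

where \(\mathbf z\) is the whitened observation vector and \(M\) ranges over a chosen set of matrices. Under the independent-source model, every \(Q_{\mathbf z}(M)\) has the form \(U \Lambda_M U^{\mathsf T}\) with the same orthogonal \(U\), so an orthogonal matrix that approximately jointly diagonalises the collection \(\{Q_{\mathbf z}(M_r)\}\) (for instance by Jacobi rotations minimising the squared off-diagonal entries) recovers the directional vectors up to permutation and sign.
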

2,851 citations

Journal Article · DOI
TL;DR: A new source separation technique exploiting the time coherence of the source signals is introduced; it relies only on stationary second-order statistics and is based on a joint diagonalization of a set of covariance matrices.
Abstract: Separation of sources consists of recovering a set of signals of which only instantaneous linear mixtures are observed. In many situations, no a priori information on the mixing matrix is available: The linear mixture should be "blindly" processed. This typically occurs in narrowband array processing applications when the array manifold is unknown or distorted. This paper introduces a new source separation technique exploiting the time coherence of the source signals. In contrast with other previously reported techniques, the proposed approach relies only on stationary second-order statistics that are based on a joint diagonalization of a set of covariance matrices. Asymptotic performance analysis of this method is carried out; some numerical simulations are provided to illustrate the effectiveness of the proposed method.
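
The sketch below is a deliberately simplified second-order variant (essentially the one-lag AMUSE special case rather than the full joint diagonalisation over many lags described in the paper); it assumes zero-mean sources whose autocorrelations at the chosen lag are distinct.

# Second-order blind separation sketch (one-lag special case of the
# joint-diagonalisation idea; the full method uses many lagged covariances).
import numpy as np

def one_lag_separation(X, lag=1):
    """X: observations with shape (n_channels, n_samples)."""
    X = X - X.mean(axis=1, keepdims=True)
    # Whitening from the zero-lag covariance.
    R0 = X @ X.T / X.shape[1]
    d, E = np.linalg.eigh(R0)
    W_white = E @ np.diag(1.0 / np.sqrt(d)) @ E.T
    Z = W_white @ X
    # Symmetrised lagged covariance of the whitened data.
    R_tau = Z[:, :-lag] @ Z[:, lag:].T / (Z.shape[1] - lag)
    R_tau = 0.5 * (R_tau + R_tau.T)
    # Its eigenvectors give the rotation that separates the sources,
    # provided the sources have distinct autocorrelations at this lag.
    _, U = np.linalg.eigh(R_tau)
    unmixing = U.T @ W_white
    return unmixing @ X, unmixing
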

2,721 citations