Reference Entry · DOI

Independent Component Analysis

31 Aug 2012
TL;DR: The basic independent component model, in which a p-variate observation is a linear mixture of p independent latent variables, is reviewed; two families of ICA estimates are considered, and their statistical properties are analyzed and compared via gain matrices.
Abstract: Independent component models have gained increasing interest in various fields of applications in recent years. The basic independent component model is a semiparametric model assuming that a p-variate observed random vector is a linear transformation of an unobserved vector of p independent latent variables. This linear transformation is given by an unknown mixing matrix, and one of the main objectives of independent component analysis (ICA) is to estimate an unmixing matrix by means of which the latent variables can be recovered. In this article, we discuss the basic independent component model in detail, define the concepts and analysis tools carefully, and consider two families of ICA estimates. The statistical properties (consistency, asymptotic normality, efficiency, robustness) of the estimates can be analyzed and compared via the so-called gain matrices. Some extensions of the basic independent component model, such as models with additive noise or models with dependent observations, are briefly discussed. The article ends with a short example. Keywords: blind source separation; fastICA; independent component model; independent subspace analysis; mixing matrix; overcomplete ICA; undercomplete ICA; unmixing matrix
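
As a concrete illustration of the basic model (not part of the original entry), the short sketch below mixes two hand-picked independent sources through an arbitrary mixing matrix and recovers them with scikit-learn's FastICA; all signal shapes and numerical values are assumptions made for the example.

# Minimal sketch of the basic ICA model x = A s (all values are illustrative).
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
t = np.linspace(0, 8, 2000)

# Two independent, non-Gaussian latent sources (p = 2).
s = np.column_stack([np.sign(np.sin(3 * t)),      # square-wave-like source
                     rng.laplace(size=t.size)])   # heavy-tailed source

A = np.array([[1.0, 0.5],                         # "unknown" mixing matrix
              [0.3, 2.0]])
x = s @ A.T                                       # observed p-variate vectors, one per row

ica = FastICA(n_components=2, random_state=0)
s_hat = ica.fit_transform(x)                      # recovered latent variables
W = ica.components_                               # estimated unmixing matrix

As in any ICA fit, the recovered components are identified only up to permutation, sign, and scale.
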
Citations
Journal Article · DOI
TL;DR: The basic theory and applications of ICA are presented; the goal is to find a linear representation of non-Gaussian data whose components are statistically independent, or as independent as possible.

8,231 citations


Cites background from "Independent Component Analysis"

  • ...Independent Component Analysis (ICA) (see Hyvärinen, Karhunen, and Oja (2001) and Cichocki and Amari (2002)) is a novel statistical signal and data analysis method....


Book
01 Jan 2009
TL;DR: The motivations and principles of learning algorithms for deep architectures are discussed, in particular those that use unsupervised learning of single-layer models, such as Restricted Boltzmann Machines, as building blocks for constructing deeper models such as Deep Belief Networks.
Abstract: Can machine learning deliver AI? Theoretical results, inspiration from the brain and cognition, as well as machine learning experiments suggest that in order to learn the kind of complicated functions that can represent high-level abstractions (e.g. in vision, language, and other AI-level tasks), one would need deep architectures. Deep architectures are composed of multiple levels of non-linear operations, such as in neural nets with many hidden layers, graphical models with many levels of latent variables, or in complicated propositional formulae re-using many sub-formulae. Each level of the architecture represents features at a different level of abstraction, defined as a composition of lower-level features. Searching the parameter space of deep architectures is a difficult task, but new algorithms have been discovered and a new sub-area has emerged in the machine learning community since 2006, following these discoveries. Learning algorithms such as those for Deep Belief Networks and other related unsupervised learning algorithms have recently been proposed to train deep architectures, yielding exciting results and beating the state-of-the-art in certain areas. Learning Deep Architectures for AI discusses the motivations for and principles of learning algorithms for deep architectures. By analyzing and comparing recent results with different learning algorithms for deep architectures, explanations for their success are proposed and discussed, highlighting challenges and suggesting avenues for future explorations in this area.
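
To make the phrase "multiple levels of non-linear operations" concrete, here is a minimal generic sketch (not code from the book); the layer sizes, random weights, and tanh non-linearity are arbitrary choices for illustration.

# Minimal sketch: a deep architecture as a composition of non-linear layers.
# Sizes, weights, and the non-linearity are illustrative choices only.
import numpy as np

rng = np.random.default_rng(0)
layer_sizes = [64, 32, 16, 8]          # input dimension followed by three hidden levels

def random_layers(sizes):
    """One randomly initialised weight matrix per level of the architecture."""
    return [rng.normal(scale=0.1, size=(m, n))
            for n, m in zip(sizes[:-1], sizes[1:])]

def forward(x, layers):
    """Each level re-represents its input as non-linear features of the level below."""
    h = x
    for W in layers:
        h = np.tanh(W @ h)
    return h

x = rng.normal(size=layer_sizes[0])
top_level_features = forward(x, random_layers(layer_sizes))
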

7,767 citations

Journal Article · DOI
TL;DR: The objective of this review paper is to summarize and compare some of the well-known methods used in various stages of a pattern recognition system and identify research topics and applications which are at the forefront of this exciting and challenging field.
Abstract: The primary goal of pattern recognition is supervised or unsupervised classification. Among the various frameworks in which pattern recognition has been traditionally formulated, the statistical approach has been most intensively studied and used in practice. More recently, neural network techniques and methods imported from statistical learning theory have been receiving increasing attention. The design of a recognition system requires careful attention to the following issues: definition of pattern classes, sensing environment, pattern representation, feature extraction and selection, cluster analysis, classifier design and learning, selection of training and test samples, and performance evaluation. In spite of almost 50 years of research and development in this field, the general problem of recognizing complex patterns with arbitrary orientation, location, and scale remains unsolved. New and emerging applications, such as data mining, web searching, retrieval of multimedia data, face recognition, and cursive handwriting recognition, require robust and efficient pattern recognition techniques. The objective of this review paper is to summarize and compare some of the well-known methods used in various stages of a pattern recognition system and identify research topics and applications which are at the forefront of this exciting and challenging field.
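
As a hedged illustration of the stages the review lists (pattern representation, feature extraction, classifier design and learning, selection of training and test samples, performance evaluation), the sketch below chains them with scikit-learn on a toy digits dataset; the particular estimators are arbitrary choices for the example, not recommendations from the paper.

# Minimal sketch of a pattern recognition pipeline: representation -> feature
# extraction -> classifier -> evaluation (estimator choices are illustrative only).
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_digits(return_X_y=True)                 # pattern representation
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)           # training / test samples

clf = make_pipeline(
    StandardScaler(),                               # normalisation of the raw representation
    PCA(n_components=30),                           # feature extraction and selection
    LogisticRegression(max_iter=1000),              # classifier design and learning
)
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))  # performance evaluation
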

6,527 citations

Journal Article · DOI
TL;DR: An integrated approach to probabilistic independent component analysis for functional MRI (FMRI) data that allows for nonsquare mixing in the presence of Gaussian noise is presented, and its spatio-temporal accuracy is compared with that of classical ICA and GLM analyses.
Abstract: We present an integrated approach to probabilistic independent component analysis (ICA) for functional MRI (FMRI) data that allows for nonsquare mixing in the presence of Gaussian noise. In order to avoid overfitting, we employ objective estimation of the amount of Gaussian noise through Bayesian analysis of the true dimensionality of the data, i.e., the number of activation and non-Gaussian noise sources. This enables us to carry out probabilistic modeling and achieves an asymptotically unique decomposition of the data. It reduces problems of interpretation, as each final independent component is now much more likely to be due to only one physical or physiological process. We also describe other improvements to standard ICA, such as temporal prewhitening and variance normalization of time series, the latter being particularly useful in the context of dimensionality reduction when weak activation is present. We discuss the use of prior information about the spatio-temporal nature of the source processes, and an alternative-hypothesis testing approach for inference, using Gaussian mixture models. The performance of our approach is illustrated and evaluated on real and artificial FMRI data, and compared, in terms of spatio-temporal accuracy, with results obtained from classical ICA and GLM analyses.
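
The sketch below is not the authors' implementation; it only mimics the two-stage idea under stated assumptions: a few non-Gaussian sources are mixed non-squarely into many channels with additive Gaussian noise, the channels are variance-normalized, PCA stands in for the Bayesian estimate of the true dimensionality, and ICA is then run within the reduced signal-plus-noise subspace.

# Hedged sketch of non-square mixing with Gaussian noise: reduce to an estimated
# signal subspace, then run ICA there (NOT the probabilistic ICA implementation).
import numpy as np
from sklearn.decomposition import PCA, FastICA

rng = np.random.default_rng(1)
n_samples, n_channels, n_sources = 5000, 20, 3

S = rng.laplace(size=(n_samples, n_sources))       # non-Gaussian sources
A = rng.normal(size=(n_channels, n_sources))       # non-square mixing matrix
X = S @ A.T + 0.2 * rng.normal(size=(n_samples, n_channels))  # plus isotropic noise

X = (X - X.mean(0)) / X.std(0)                     # variance-normalise each channel

pca = PCA(n_components=n_sources)                  # stand-in for the Bayesian
Z = pca.fit_transform(X)                           # dimensionality estimate

S_hat = FastICA(random_state=0).fit_transform(Z)   # ICA within the reduced subspace
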

2,597 citations


Cites background or methods from "Independent Component Analysis"

  • ...In order to optimise for non-Gaussian source estimates, [23] propose the following contrast function:...


  • ...At the second stage the source signals are estimated within the lower-dimensional signal + noise subspace using a fixed-point iteration scheme [23] that maximises the non-Gaussianity of the source estimates....


  • ...A proof of convergence and discussion about the choice of the non-linear function can be found in [23]....


  • ...Earlier work [41] characterised the multivariate normal distribution through the non-uniqueness of its linear structure, a result which within the ICA literature has been restated as the limitation that only one Gaussian source process, at most, may contribute to the observations for the ICA model to be estimable [15, 23]....


  • ...[23] have presented an elegant fixed-point algorithm that uses approximations to negentropy in order to optimise for non-Gaussian source distributions, and give a clear account of the relation of this approach to statistical independence....


References
Journal Article · DOI
TL;DR: An efficient algorithm is proposed, which allows the computation of the ICA of a data matrix in polynomial time and may be seen as an extension of principal component analysis (PCA).

8,522 citations

Journal Article · DOI
TL;DR: Using maximum entropy approximations of differential entropy, a family of new contrast (objective) functions for ICA is introduced, enabling both estimation of the whole decomposition by minimizing mutual information and estimation of individual independent components as projection pursuit directions.
Abstract: Independent component analysis (ICA) is a statistical method for transforming an observed multidimensional random vector into components that are statistically as independent from each other as possible. We use a combination of two different approaches for linear ICA: Comon's information theoretic approach and the projection pursuit approach. Using maximum entropy approximations of differential entropy, we introduce a family of new contrast functions for ICA. These contrast functions enable both the estimation of the whole decomposition by minimizing mutual information, and estimation of individual independent components as projection pursuit directions. The statistical properties of the estimators based on such contrast functions are analyzed under the assumption of the linear mixture model, and it is shown how to choose contrast functions that are robust and/or of minimum variance. Finally, we introduce simple fixed-point algorithms for practical optimization of the contrast functions.
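
For orientation, the maximum-entropy approximation of negentropy behind such contrast functions can be written in standard notation (a sketch, not quoted from the paper) as

\[ J(y) \;\approx\; c\,\bigl[\mathrm{E}\{G(y)\} - \mathrm{E}\{G(\nu)\}\bigr]^{2}, \qquad \nu \sim \mathcal{N}(0,1), \]

where \(y = \mathbf{w}^{\mathsf T}\mathbf{x}\) is a zero-mean, unit-variance projection of the whitened data and \(c > 0\) is a constant; commonly used choices include \(G(u) = \tfrac{1}{a}\log\cosh(au)\) with \(1 \le a \le 2\) and \(G(u) = -\exp(-u^{2}/2)\). Maximizing \(J\) over unit-norm \(\mathbf{w}\) yields one independent component, and mutually decorrelated maxima give the full decomposition.
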

6,144 citations

Journal Article · DOI
TL;DR: A novel fast algorithm for independent component analysis is introduced, which can be used for blind source separation and feature extraction, and the convergence speed is shown to be cubic.
Abstract: We introduce a novel fast algorithm for independent component analysis, which can be used for blind source separation and feature extraction. We show how a neural network learning rule can be transformed into a fixed-point iteration, which provides an algorithm that is very simple, does not depend on any user-defined parameters, and is fast to converge to the most accurate solution allowed by the data. The algorithm finds, one at a time, all non-Gaussian independent components, regardless of their probability distributions. The computations can be performed in either batch mode or a semiadaptive manner. The convergence of the algorithm is rigorously proved, and the convergence speed is shown to be cubic. Some comparisons to gradient-based algorithms are made, showing that the new algorithm is usually 10 to 100 times faster, sometimes giving the solution in just a few iterations.
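
A minimal one-unit version of such a fixed-point iteration, written as a generic sketch rather than the authors' original algorithm, could look as follows; it assumes the data matrix has already been centred and whitened, and uses tanh as the non-linearity.

# One-unit FastICA-style fixed-point iteration (illustrative sketch).
# Assumes X has shape (n_features, n_samples) and is already centred and whitened.
import numpy as np

def one_unit_fixed_point(X, n_iter=200, tol=1e-8, seed=0):
    rng = np.random.default_rng(seed)
    w = rng.normal(size=X.shape[0])
    w /= np.linalg.norm(w)
    for _ in range(n_iter):
        wx = w @ X                                         # projections w^T x
        g, g_prime = np.tanh(wx), 1.0 - np.tanh(wx) ** 2
        w_new = (X * g).mean(axis=1) - g_prime.mean() * w  # fixed-point update
        w_new /= np.linalg.norm(w_new)                     # back to the unit sphere
        if abs(abs(w_new @ w) - 1.0) < tol:                # converged up to sign
            return w_new
        w = w_new
    return w

Running the routine repeatedly with deflation (orthogonalizing each new w against the directions already found) recovers the remaining components one at a time.
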

3,215 citations

Journal Article · DOI
01 Dec 1993
TL;DR: In this paper, a computationally efficient technique for blind estimation of directional vectors, based on joint diagonalization of fourth-order cumulant matrices, is presented for beamforming.
Abstract: The paper considers an application of blind identification to beamforming. The key point is to use estimates of directional vectors rather than resort to their hypothesised value. By using estimates of the directional vectors obtained via blind identification, i.e. without knowing the array manifold, beamforming is made robust with respect to array deformations, distortion of the wave front, pointing errors etc., so that neither array calibration nor physical modelling is necessary. Rather surprisingly, ‘blind beamformers’ may outperform ‘informed beamformers’ in a plausible range of parameters, even when the array is perfectly known to the informed beamformer. The key assumption on which blind identification relies is the statistical independence of the sources, which is exploited using fourth-order cumulants. A computationally efficient technique is presented for the blind estimation of directional vectors, based on joint diagonalisation of fourth-order cumulant matrices; its implementation is described, and its performance is investigated by numerical experiments.
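
In standard notation (a sketch of the idea, not the paper's exact formulation), the fourth-order cumulant matrices that are jointly diagonalised can be written as

\[ \bigl[Q_{\mathbf z}(M)\bigr]_{ij} \;=\; \sum_{k,l} \operatorname{cum}(z_i, z_j, z_k, z_l)\, M_{kl}, \]

where \(\mathbf z\) is the whitened observation vector and \(M\) ranges over a chosen set of matrices. Under the independent-source model, every \(Q_{\mathbf z}(M)\) has the form \(U \Lambda_M U^{\mathsf T}\) with the same orthogonal \(U\), so an orthogonal matrix that approximately jointly diagonalises the collection \(\{Q_{\mathbf z}(M_r)\}\) (for instance by Jacobi rotations minimising the squared off-diagonal entries) recovers the directional vectors up to permutation and sign.
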

2,851 citations

Journal Article · DOI
TL;DR: A new source separation technique exploiting the time coherence of the source signals is introduced; it relies only on stationary second-order statistics and is based on a joint diagonalization of a set of covariance matrices.
Abstract: Separation of sources consists of recovering a set of signals of which only instantaneous linear mixtures are observed. In many situations, no a priori information on the mixing matrix is available: The linear mixture should be "blindly" processed. This typically occurs in narrowband array processing applications when the array manifold is unknown or distorted. This paper introduces a new source separation technique exploiting the time coherence of the source signals. In contrast with other previously reported techniques, the proposed approach relies only on stationary second-order statistics that are based on a joint diagonalization of a set of covariance matrices. Asymptotic performance analysis of this method is carried out; some numerical simulations are provided to illustrate the effectiveness of the proposed method.
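
The sketch below is a deliberately simplified second-order variant (essentially the one-lag AMUSE special case rather than the full joint diagonalisation over many lags described in the paper); it assumes zero-mean sources whose autocorrelations at the chosen lag are distinct.

# Second-order blind separation sketch (one-lag special case of the
# joint-diagonalisation idea; the full method uses many lagged covariances).
import numpy as np

def one_lag_separation(X, lag=1):
    """X: observations with shape (n_channels, n_samples)."""
    X = X - X.mean(axis=1, keepdims=True)
    # Whitening from the zero-lag covariance.
    R0 = X @ X.T / X.shape[1]
    d, E = np.linalg.eigh(R0)
    W_white = E @ np.diag(1.0 / np.sqrt(d)) @ E.T
    Z = W_white @ X
    # Symmetrised lagged covariance of the whitened data.
    R_tau = Z[:, :-lag] @ Z[:, lag:].T / (Z.shape[1] - lag)
    R_tau = 0.5 * (R_tau + R_tau.T)
    # Its eigenvectors give the rotation that separates the sources,
    # provided the sources have distinct autocorrelations at this lag.
    _, U = np.linalg.eigh(R_tau)
    unmixing = U.T @ W_white
    return unmixing @ X, unmixing
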

2,721 citations