
Showing papers on "Principal component analysis published in 1982"


Journal ArticleDOI
TL;DR: A simple linear neuron model with constrained Hebbian-type synaptic modification is analyzed and a new class of unconstrained learning rules is derived.
Abstract: A simple linear neuron model with constrained Hebbian-type synaptic modification is analyzed and a new class of unconstrained learning rules is derived. It is shown that the model neuron tends to extract the principal component from a stationary input vector sequence.
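The behavior described can be sketched with a minimal NumPy simulation of a constrained Hebbian rule of the kind analyzed in the paper (Oja's rule); the covariance matrix, learning rate, and sample count below are illustrative choices, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
# Stationary input sequence: 2-D Gaussian whose principal eigenvector is (1, 1)/sqrt(2)
C = np.array([[3.0, 2.0], [2.0, 3.0]])
X = rng.multivariate_normal([0.0, 0.0], C, size=20000)

w = rng.normal(size=2)
w /= np.linalg.norm(w)
eta = 0.001
for x in X:
    y = w @ x                      # output of the linear neuron
    w += eta * y * (x - y * w)     # Hebbian term eta*y*x, minus eta*y^2*w to bound ||w||

top = np.linalg.eigh(C)[1][:, -1]  # true principal eigenvector of the input covariance
alignment = abs(w @ top) / np.linalg.norm(w)
```

After training, `w` lines up with the leading eigenvector of the input covariance, which is the sense in which the model neuron "extracts the principal component"; the subtracted `y**2 * w` term keeps the weight norm bounded without an explicit constraint.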

2,405 citations


Journal ArticleDOI
TL;DR: The use of principal components in regression has received a lot of attention in the literature in recent years, and the topic is now beginning to appear in textbooks, as discussed by the authors. However, there appears to have been a growth in the misconception that principal components with small eigenvalues will very rarely be of any use in regression.
Abstract: The use of principal components in regression has received a lot of attention in the literature in the past few years, and the topic is now beginning to appear in textbooks. Along with the use of principal component regression there appears to have been a growth in the misconception that the principal components with small eigenvalues will very rarely be of any use in regression. The purpose of this note is to demonstrate that these components can be as important as those with large variance. This is illustrated with four examples, three of which have already appeared in the literature.
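The note's point can be illustrated with a small fabricated example (not one of the paper's four): the response is driven entirely by the low-variance component of two nearly collinear predictors.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
z = rng.normal(size=n)               # dominant shared variation
d = 0.1 * rng.normal(size=n)         # small-variance variation
X = np.column_stack([z + d, z - d])  # two nearly collinear predictors
y = X[:, 0] - X[:, 1]                # response depends only on the small component (2d)

Xc = X - X.mean(axis=0)
evals, evecs = np.linalg.eigh(np.cov(Xc.T))   # eigenvalues in ascending order
scores = Xc @ evecs
r_small = np.corrcoef(scores[:, 0], y)[0, 1]  # component with the SMALLEST eigenvalue
r_large = np.corrcoef(scores[:, 1], y)[0, 1]  # component with the largest eigenvalue
```

Here the small-eigenvalue component predicts y almost perfectly while the large-variance one is useless, so screening components by variance alone would discard the only useful regressor.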

823 citations


Journal ArticleDOI
TL;DR: In this paper, convergence results for linear principal component analysis by sampling were derived for a random function in a separable Hilbert space, and the limiting distributions were given for the principal values and the principal factors.

488 citations


Journal ArticleDOI
TL;DR: In this article, a technique is presented for selection of principal components for which the geophysical signal is greater than the level of noise, which is simulated by repeated sampling of principal components computed from a spatially and temporally uncorrelated random process.
Abstract: A technique is presented for selection of principal components for which the geophysical signal is greater than the level of noise. The level of noise is simulated by repeated sampling of principal components computed from a spatially and temporally uncorrelated random process. By contrasting the application of principal components based upon the covariance matrix and correlation matrix for a given data set of cyclone frequencies, it is shown that the former is more suitable to fitting data and locating the individual variables that represent large variance in the record, while the latter is more suitable for resolving spatial oscillations such as the movement of primary storm tracks.
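A sketch of this style of selection rule (sometimes called "Rule N"): eigenvalues of the observed record are compared against eigenvalues obtained by repeated sampling of an uncorrelated random process of the same shape. The synthetic one-pattern data and the 95th-percentile cutoff are illustrative assumptions, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(2)
n_obs, n_var = 100, 10

# Synthetic record: one real spatial pattern plus white noise
pattern = rng.normal(size=n_var)
data = rng.normal(size=(n_obs, 1)) * pattern + rng.normal(size=(n_obs, n_var))

def sorted_evals(X):
    Xc = X - X.mean(axis=0)
    return np.sort(np.linalg.eigvalsh(np.cov(Xc.T)))[::-1]

obs = sorted_evals(data)

# Noise level: eigenvalues from repeated sampling of a spatially and
# temporally uncorrelated random process of the same dimensions
trials = np.array([sorted_evals(rng.normal(size=(n_obs, n_var)))
                   for _ in range(200)])
noise95 = np.percentile(trials, 95, axis=0)

n_signal = int(np.sum(obs > noise95))  # components whose signal exceeds the noise level
```

Only the planted pattern's eigenvalue rises clearly above the simulated noise floor, so the rule retains it and discards the rest.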

440 citations


Journal ArticleDOI
TL;DR: The method is based on successively predicting each element in the data matrix after deleting the corresponding row and column of the matrix, and makes use of recently published algorithms for updating a singular value decomposition.
Abstract: A method is described for choosing the number of components to retain in a principal component analysis when the aim is dimensionality reduction. The correspondence between principal component analysis and the singular value decomposition of the data matrix is used. The method is based on successively predicting each element in the data matrix after deleting the corresponding row and column of the matrix, and makes use of recently published algorithms for updating a singular value decomposition. These are very fast, which renders the proposed technique a practicable one for routine data analysis.
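The paper's exact scheme leans on fast SVD-updating algorithms, but the underlying idea can be sketched with a naive element-wise cross-validation: to predict entry (i, j), fit the SVD without row i, then regress the rest of that row on the leading right singular vectors so the entry itself is never used in its own prediction. All data and settings below are made up.

```python
import numpy as np

rng = np.random.default_rng(3)
n, p, true_rank = 60, 8, 2
data = (rng.normal(size=(n, true_rank)) @ rng.normal(size=(true_rank, p))
        + 0.1 * rng.normal(size=(n, p)))       # rank-2 signal plus noise

def press(k):
    """Total squared prediction error with k components, never using the
    predicted entry in its own fit."""
    err = 0.0
    for i in range(n):
        train = np.delete(data, i, axis=0)
        mu = train.mean(axis=0)
        _, _, Vt = np.linalg.svd(train - mu, full_matrices=False)
        V = Vt[:k].T                           # p x k leading right singular vectors
        xc = data[i] - mu
        for j in range(p):
            keep = np.arange(p) != j
            if k == 0:
                pred = 0.0                     # mean-only prediction
            else:
                t, *_ = np.linalg.lstsq(V[keep], xc[keep], rcond=None)
                pred = V[j] @ t
            err += (xc[j] - pred) ** 2
    return err

scores = [press(k) for k in range(p)]
best_k = int(np.argmin(scores))
```

This brute-force version recomputes an SVD for every deleted row, which is exactly the cost the paper's updating algorithms avoid; the prediction error drops sharply up to the true rank and flattens or rises beyond it.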

364 citations


Journal ArticleDOI
TL;DR: In this paper, an extension of classical data analytic techniques designed for p-variate observations to such data is discussed. The essential step is the expression of the classical problem in the language of functional analysis, after which the extension to functions is a straightforward matter.
Abstract: A datum is often a continuous function x(t) of a variable such as time observed over some interval. One or more such functions are observed for each subject or unit of observation. The extension of classical data analytic techniques designed for p-variate observations to such data is discussed. The essential step is the expression of the classical problem in the language of functional analysis, after which the extension to functions is a straightforward matter. A schematic device called the duality diagram is a very useful tool for describing an analysis and for suggesting new possibilities. Least squares approximation, descriptive statistics, principal components analysis, and canonical correlation analysis are discussed within this broader framework.

239 citations


Journal ArticleDOI
TL;DR: In this paper, a discussion in expository form of the use of singular value decomposition in multiple linear regression, with special reference to the problems of collinearity and near collinearity, is presented.
Abstract: Principal component analysis, particularly in the form of singular value decomposition, is a useful technique for a number of applications, including the analysis of two-way tables, evaluation of experimental design, empirical fitting of functions, and regression. This paper is a discussion in expository form of the use of singular value decomposition in multiple linear regression, with special reference to the problems of collinearity and near collinearity.
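A sketch of the idea with fabricated data: the SVD exposes near collinearity as a small singular value, and truncating that direction stabilizes the fitted coefficients.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 1e-4 * rng.normal(size=n)            # nearly collinear with x1
X = np.column_stack([np.ones(n), x1, x2])
y = 1.0 + x1 + x2 + 0.1 * rng.normal(size=n)

U, s, Vt = np.linalg.svd(X, full_matrices=False)
condition = s[0] / s[-1]                       # huge ratio flags near collinearity

def svd_solve(keep):
    """Least-squares solution using only the `keep` largest singular values."""
    return Vt[:keep].T @ ((U[:, :keep].T @ y) / s[:keep])

beta_ols = svd_solve(3)   # full solution: x1/x2 coefficients individually unstable
beta_pcr = svd_solve(2)   # drop the near-null direction: the shared effect is split stably
```

The truncated solution assigns nearly equal coefficients to the two collinear predictors and recovers their combined effect, while the full solution can put wildly offsetting values on them.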

225 citations


Journal ArticleDOI
TL;DR: The general conclusion reached is that the three methods produce results that are equivalent, and several important trends were observed.
Abstract: Factor analysis and component analysis represent two broad classes of methods employed generally with similar types of problems. The purpose of the present study is to determine the extent to which and under what conditions the methods produce different patterns. Principal component analysis, image component analysis, and maximum likelihood factor analysis were performed on simulated data matrices. Comparisons were made between each of the three methods and to ideal patterns. Sample size, saturation, and type of pattern were systematically varied. The general conclusion reached is that the three methods produce results that are equivalent. In addition, several important trends were observed.

177 citations


Journal ArticleDOI
TL;DR: In this paper, the authors provide an example to demonstrate some distorted loading patterns which can result from the direct application of PC analysis (or eigenvector analysis, factor analysis, or asymptotic singular decomposition) on irregularly spaced data.
Abstract: Principal component (PC) analysis performed on irregularly spaced data can produce distorted loading patterns. We provide an example to demonstrate some distorted patterns which can result from the direct application of PC analysis (or eigenvector analysis, factor analysis, or asymptotic singular decomposition) on irregularly spaced data. The PCs overestimate loadings in areas of dense data. The problem can be avoided by interpolating the irregularly spaced data to a grid which closely approximates equal-area.

99 citations


Book ChapterDOI
Jan de Leeuw1
01 Jan 1982
TL;DR: In this paper, an alternative algorithm for nonlinear principal component analysis is proposed which combines features of both previous approaches: multiple correspondence analysis and nonmetric principal component analysis.
Abstract: Two quite different forms of nonlinear principal component analysis have been proposed in the literature. The first one is associated with the names of Guttman, Burt, Hayashi, Benzecri, McDonald, De Leeuw, Hill, Nishisato. We call it multiple correspondence analysis. The second form has been discussed by Kruskal, Shepard, Roskam, Takane, Young, De Leeuw, Winsberg, Ramsay. We call it nonmetric principal component analysis. The two forms have been related and combined, both geometrically and computationally, by Albert Gifi. In this paper we discuss the relationships in more detail, and propose an alternative algorithm for nonlinear principal component analysis which combines features of both previous approaches.

91 citations


Journal ArticleDOI
TL;DR: In this article, correspondence analysis is used as a statistical tool of the data reduction type, useful for abundance data, applied to three research examples from the archaeology of northern Norway.
Abstract: Correspondence analysis is put forward as a statistical tool of the data reduction type, useful for abundance data. The method is applied to three research examples from the archaeology of northern Norway. It is pointed out that the method can be used as a device for automatic seriation.

Journal ArticleDOI
TL;DR: In this paper, a series of multivariate methods have been compared to assess their effectiveness in extracting essential information out of a complex micropaleontological data-set, which consists of relative frequencies (percentages) of Miocene coccolith taxa or groups of taxa in cores of the Deep Sea Drilling Project (DSDP) from the Atlantic Ocean.

Journal ArticleDOI
David C Hagen1
TL;DR: In this article, principal component correlation coefficients were found to provide an accurate method of grouping the traces in both the supervised and unsupervised modes; if one or more well logs are available, their geographical locations relative to the seismic data can be used to initialize the cluster centers, to which other traces are added as appropriate.

Journal ArticleDOI
TL;DR: In this article, an alternative approach to canonical correlation, based on a general linear multivariate model, is presented, and principal component analysis (PCA) is used to help explain the method.
Abstract: Canonical correlation has been little used and little understood, even by otherwise sophisticated analysts. An alternative approach to canonical correlation, based on a general linear multivariate model, is presented. Properties of principal component analysis are used to help explain the method. Standard computational methods for full rank canonical correlation, techniques for canonical correlation on component scores, and canonical correlation with less than full rank are discussed. They are seen to be essentially equivalent when the model equation for canonical correlation on component scores is presented. The two approaches to less than full rank situations are equivalent in some senses, but quite different in usefulness, depending on the application. An example dataset is analyzed in detail to help demonstrate the conclusions.
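The full-rank case described can be sketched compactly through each block's principal components: whiten each variable set via its own SVD, and the singular values of the cross-product of the whitened blocks are the canonical correlations. The two invented data sets below share one common factor; none of the numbers come from the paper's example.

```python
import numpy as np

rng = np.random.default_rng(9)
n = 500
shared = rng.normal(size=(n, 1))               # common factor linking the two sets
X = np.hstack([shared + 0.5 * rng.normal(size=(n, 1)), rng.normal(size=(n, 2))])
Y = np.hstack([shared + 0.5 * rng.normal(size=(n, 1)), rng.normal(size=(n, 1))])

Xc, Yc = X - X.mean(axis=0), Y - Y.mean(axis=0)
# Whiten each block via its SVD (principal components scaled to unit variance)
Ux = np.linalg.svd(Xc, full_matrices=False)[0]
Uy = np.linalg.svd(Yc, full_matrices=False)[0]

# Singular values of the whitened cross-product = canonical correlations
rho = np.linalg.svd(Ux.T @ Uy, compute_uv=False)
```

The leading canonical correlation recovers the strength of the shared factor (about 0.8 by construction here), while the remaining one is near zero, illustrating why the component-score and direct formulations coincide in the full-rank case.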

Journal ArticleDOI
TL;DR: Nest-site characteristics of nine bird species breeding in high densities in the dune-ridge forest at Delta Marsh, Manitoba, were analyzed using multivariate techniques and identified three distinct groups of species, based primarily on vertical stratification.
Abstract: Nest-site characteristics of nine bird species breeding in high densities in the dune-ridge forest at Delta Marsh, Manitoba, were analyzed using multivariate techniques. Varimax-rotated principal component analysis of the entire set of nest-site variables suggested partitioning of the data into nest-habitat and nest-tree subsets. Discriminant analysis of nest-habitat variables confirmed the ambiguous nature of species relationships in the factor analysis. Discriminant analysis of nest-tree variables identified three distinct groups of species, based primarily on vertical stratification. The existence of these groups and their memberships were supported by similar results derived from discriminant analysis of the entire nest-site data set. Within these groups, pairs of species showed sufficient similarity in nest sites to warrant detailed investigation.

Journal ArticleDOI
TL;DR: It is shown that singular value decomposition (s.v.d.) is an excellent tool for studying the limit properties of a feasible solution for the inverse problem in electrocardiography and leads to a noise filtering algorithm, which at the same time results in useful data reduction.
Abstract: In the paper it is shown that singular value decomposition (s.v.d.) is an excellent tool for studying the limit properties of a feasible solution for the inverse problem in electrocardiography. When s.v.d. is applied to the transfer matrix, relating equivalent heart sources to the skin potentials, it provides a measure of the observability. In an example presented, a series of orthonormal potential patterns on a pericardial surface are found in an order of decreasing observability. When s.v.d. is applied to a data matrix, consisting of skin potentials as a function of time and position, one finds the normalised principal components both in time and space. An appropriate use of the singular values leads to a noise filtering algorithm, which at the same time results in useful data reduction. Comparison of spatial potential patterns derived from both the transfer matrix and the data matrix may, finally, be used to evaluate the assumptions on the transfer.
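The noise-filtering-plus-data-reduction step applied to the data matrix can be sketched as a truncated SVD; the synthetic space-time matrix below (two smooth "source" components plus noise) is invented, not electrocardiographic data.

```python
import numpy as np

rng = np.random.default_rng(5)
t = np.linspace(0, 1, 200)
# "Potentials" over 200 time samples at 30 positions: two smooth components + noise
clean = (np.outer(np.sin(2 * np.pi * t), rng.normal(size=30))
         + np.outer(np.cos(2 * np.pi * t), rng.normal(size=30)))
noisy = clean + 0.5 * rng.normal(size=clean.shape)

U, s, Vt = np.linalg.svd(noisy, full_matrices=False)
k = 2                                           # keep the dominant space-time components
filtered = (U[:, :k] * s[:k]) @ Vt[:k]          # rank-k reconstruction

err_noisy = np.linalg.norm(noisy - clean)
err_filt = np.linalg.norm(filtered - clean)
```

Keeping only the components with large singular values both filters the noise and reduces the data to k temporal and k spatial patterns, which is the dual role the paper describes.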

Journal ArticleDOI
TL;DR: This pattern recognition algorithm is verified using multi-sensor imagery, and the results are found to compare favorably to those obtained using other candidate techniques.
Abstract: Concepts, measures, and models of image quality are shown to be quite important in pattern recognition applications. Pattern recognition of imagery subjected to geometrical differences (such as scale and rotational changes) and intensity differences (such as arise in multispectral imagery) are considered. After modeling these image differences as a stochastic process, the optimal filter is derived. This filter is shown to be the principal component of the data. This pattern recognition algorithm is verified using multi-sensor imagery, and the results are found to compare favorably to those obtained using other candidate techniques.

Journal ArticleDOI
TL;DR: In this paper, principal components analysis and trend surface analysis have been applied to a transition mire with the aim to characterize the vegetation pattern and reveal the major trends of variation, and the first three PCA axes were ecologically interpretable, viz. the 1 st and 2nd as a complex soil moisture gradient and the 3rd axis as a gradient in the amount of peat in the soil.
Abstract: Principal components analysis and trend surface analysis have been applied to a transition mire with the aim of characterizing the vegetation pattern and revealing the major trends of variation. The first three PCA axes were ecologically interpretable, viz. the 1st and 2nd as a complex soil moisture gradient and the 3rd axis as a gradient in the amount of peat in the soil. The ecological interpretability of the 1st axis of PCA after VARIMAX rotation is unclear because some outlier samples caused a reorientation of the axis. TSA appeared to be useful for the clarification of joint patterns of species groups, which were major contributors to ordination axes in terms of component loadings. The smoothing effect of TSA was briefly discussed in connection with the influence of extremes upon the resulting trend structure. The use of four-variable TSA including a time series is emphasized for the study of spatial-temporal relations and ecological succession.

Journal ArticleDOI
TL;DR: The principal component analysis of the total Doppler signal was statistically superior to the A/B ratio in separating the two groups examined in this study.
Abstract: Principal component factor analysis, a mathematical feature extraction technique, has been used to analyse the total information contained in the Doppler signal. In this study two patient groups have been investigated, normals and stenoses of less than 50%. The patients have been classified according to angiographic findings (patients with hypertension, migraine, heart disease, etc. have not been excluded). The results from the principal component analysis technique have been compared with the more familiar A/B ratio based on the maximum frequency envelope. Of the 25 normal vessel segments 20 were classified as normal by the A/B ratio technique and 22 by the principal component technique, while of the 19 abnormal vessels 13 were classified as abnormal by the A/B technique and 17 by the principal component analysis. The principal component analysis of the total Doppler signal was also statistically superior to the A/B ratio in separating the two groups examined in this study.

Journal ArticleDOI
TL;DR: Combining the use of the M-estimator robust regression procedure and the robust Mahalanobis distance procedure with principal components analysis is demonstrated to be a general method of outlier detection.
Abstract: Because the eight largest bank failures in United States history have occurred since 1973 [24], the development of early-warning problem-bank identification models is an important undertaking. It has been shown previously [3] [5] that M-estimator robust regression provides such a model. The present paper develops a similar model for the multivariate case using both a robustified Mahalanobis distance analysis [21] and principal components analysis [10]. In addition to providing a successful presumptive problem-bank identification model, combining the use of the M-estimator robust regression procedure and the robust Mahalanobis distance procedure with principal components analysis is also demonstrated to be a general method of outlier detection. The results from using these procedures are compared to some previously suggested procedures, and general conclusions are drawn.
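A rough sketch of the multivariate outlier-detection idea; a crude trimmed covariance stands in for the paper's M-estimator and robustified Mahalanobis machinery, and the data are fabricated rather than bank financial ratios.

```python
import numpy as np

rng = np.random.default_rng(6)
# Bulk of observations: correlated "ratios"; the last 3 rows are planted outliers
bulk = rng.multivariate_normal(
    [0, 0, 0], [[1.0, 0.8, 0.5], [0.8, 1.0, 0.6], [0.5, 0.6, 1.0]], size=97)
outliers = np.array([[6.0, -6.0, 0.0], [0.0, 7.0, -5.0], [-6.0, 0.0, 6.0]])
X = np.vstack([bulk, outliers])

# Robustified location/scatter: trim points far from the coordinate-wise median,
# then estimate the covariance from the remaining core
med = np.median(X, axis=0)
d0 = np.linalg.norm(X - med, axis=1)
core = X[d0 <= np.percentile(d0, 80)]
S = np.cov(core.T)
Sinv = np.linalg.inv(S)

diff = X - core.mean(axis=0)
maha = np.einsum('ij,jk,ik->i', diff, Sinv, diff)   # squared Mahalanobis distances

flagged = set(np.argsort(maha)[-3:])                # the three most extreme rows
```

Because the distance uses the correlation structure of the uncontaminated core, observations that violate that structure stand out even when no single variable looks extreme, which is the general-outlier-detection point the paper makes.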

Journal ArticleDOI
TL;DR: A method of weighting the variables is suggested which is a part of the classification procedure and thus guarantees an improvement of the cluster clarity and is compared with such popular weighting procedures as equal variance and Mahalanobis distance.

Journal ArticleDOI
TL;DR: In this paper, an objective multivariate analysis technique is used to establish between-area uniformity in order to justify the use of substitution of space for time as a valid inferential approach to studying plant succession.

Journal ArticleDOI
TL;DR: In this article, the techniques of principal component analysis (PCA) and subsequent regression analysis were used in an attempt to describe local and upwind chemical and physical factors which affect the variability of SO₄²⁻ concentrations observed in a rural area of the northeastern U.S.
Abstract: The techniques of Principal Component Analysis (PCA) and subsequent regression analysis were used in an attempt to describe local and upwind chemical and physical factors which affect the variability of SO₄²⁻ concentrations observed in a rural area of the northeastern U.S. The data used in the analyses included upwind and local O₃ concentrations, temperature, relative humidity and other climatological information, SO₂, and meteorological information associated with backward trajectories. The investigation identified five principal components, three major (eigenvalues >1) and two minor (eigenvalues <1).

Journal ArticleDOI
TL;DR: Results of this preliminary investigation confirm the usefulness of the principal component analysis in a qualitative presentation of the multi-band data and its association with a significant reduction in dimensionality.
Abstract: A Landsat multispectral image was combined with the corresponding digital terrain elevation data to study several information extraction procedures. Principal component and limited multispectral classification procedures were conducted on 1024 × 1024 four-band Landsat and five-band (Landsat plus terrain data) images, and color composites as well as quantitative information were generated. Selected results of this preliminary investigation confirm the usefulness of the principal component analysis in a qualitative presentation of the multi-band data and its association with a significant reduction in dimensionality. However, unlike some other investigators, we found that the full dimensionality must be retained when the information content of the data has to be preserved quantitatively.

Journal ArticleDOI
TL;DR: A geometric approach to principal components analysis is presented in The American Statistician, Vol. 36, No. 4, pp. 365-367.
Abstract: (1982) A Geometric Approach to Principal Components Analysis. The American Statistician: Vol. 36, No. 4, pp. 365-367.

Journal ArticleDOI
TL;DR: In this article, the quantitative dermatoglyphic traits of the Taimir aborigines have been studied and the correlation matrix of the traits was analyzed by a nonmetric two-dimensional scaling method and by a principal components method.
Abstract: The quantitative dermatoglyphic traits of the Taimir aborigines have been studied in this paper. The correlation matrix of the traits was analyzed by a nonmetric two-dimensional scaling method and by a principal components method. The comparative contribution of palmar and digital trait variation to the principal components is discussed.

Journal Article
TL;DR: In this paper, a new method which determines incident directions of seismic rays from three component data recorded at one seismic station is presented, which is based on the principal component analysis and has the following characteristics; it conserves phase relation among three components, it may improve the S/N ratio in the course of calculations, it can express the variance of the directions of phases composing a P wavelet, and it is adequate to process digitized data semi-automatically.
Abstract: A new method which determines incident directions of seismic rays from three component data recorded at one seismic station is presented. This method is based on the principal component analysis and has the following characteristics; it conserves phase relation among three components, it may improve the S/N ratio in the course of calculations, it can express the variance of the directions of phases composing a P wavelet, and it is adequate to process digitized data semi-automatically. Applying this method to data obtained by the seismic array system of the Research Center for Earthquake Prediction, Hokkaido University (RCEP), the following results are obtained. (1) The observation system has inherently some causes of error and the amount of error in direction is several degrees. (2) This method gives fairly stable solutions regardless of amplitudes of seismic waves. (3) Underground structure can be deduced by comparing directions obtained by this method and that calculated from locations of earthquakes, and the differences between them are large compared to the errors in (1). (4) Inhomogeneity of the crust may be expressed by the rectilinearity. (5) Local seismic activity can be monitored. From (3) and (4), it is deduced that, for the Hokkaido region, the structure under the Hidaka mountain range is more inhomogeneous than that of the other regions of Hokkaido and P wave velocity of the land side crust or the upper mantle is lower than that of the ocean side.
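The core of such a polarization method can be sketched as PCA on the covariance matrix of the three components: the principal eigenvector estimates the ray direction, and the eigenvalue spread gives a rectilinearity measure. The synthetic wavelet, noise level, and direction below are invented, and the sign ambiguity is resolved here using the known direction, where a real implementation would use the first motion.

```python
import numpy as np

rng = np.random.default_rng(7)
# Synthetic P-wavelet: rectilinear particle motion along a known incidence direction
direction = np.array([0.3, 0.4, 0.866])
direction /= np.linalg.norm(direction)
tt = np.linspace(0, 4, 200)
wavelet = np.sin(np.pi * tt) * np.exp(-tt)
data = np.outer(wavelet, direction) + 0.02 * rng.normal(size=(200, 3))  # Z, N, E

C = np.cov(data.T)
evals, evecs = np.linalg.eigh(C)               # eigenvalues in ascending order
p = evecs[:, -1]                               # principal axis = estimated ray direction
p = p * np.sign(p @ direction)                 # resolve the eigenvector sign ambiguity

# Rectilinearity: near 1 when motion is confined to a single line
rectilinearity = 1.0 - (evals[0] + evals[1]) / (2.0 * evals[2])
angle_err = np.degrees(np.arccos(np.clip(p @ direction, -1.0, 1.0)))
```

With nearly rectilinear motion the two minor eigenvalues are small relative to the principal one, so the direction estimate is stable and the rectilinearity is close to 1, matching the paper's use of rectilinearity as an inhomogeneity indicator.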


Book ChapterDOI
R. C. Tabony1
01 Jan 1982
TL;DR: In this article, the authors examined various methods of estimating missing values in highly correlated climatological data and concluded that principal component analysis (PCA) is likely to be the best statistical tool for estimating missing values among highly correlated data.
Abstract: Various methods of estimating missing values in highly correlated climatological data are examined. Any generalised method is likely to be based on a correlation matrix, but the incompleteness of the data introduces problems with this approach. These are illustrated by program BMDPAM of the BMDP suite, which produces estimates worse than those using traditional methods based on single station comparisons. Principal component analysis is considered likely to be the best statistical tool for estimating missing values among highly correlated data. The high quality correlation matrix required as input can be obtained by using a simple estimating procedure to produce a preliminary set of complete data. A simple technique was devised for climatological data in the UK, and was found to give estimates similar to those obtained from an eigenvector scheme used for quality control purposes.
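An iterative PCA fill of the general sort the paper points toward can be sketched as follows: initialize gaps with column means, then repeatedly refit a low-rank SVD model and overwrite the gaps with its predictions. The rank-1 model, 10% gap rate, and iteration count are illustrative assumptions, not the paper's scheme.

```python
import numpy as np

rng = np.random.default_rng(8)
n, p = 100, 6
# Highly correlated "station" records: one common signal plus small local noise
common = rng.normal(size=(n, 1))
data = common + 0.1 * rng.normal(size=(n, p))

mask = rng.random((n, p)) < 0.1                 # knock out ~10% of the values
obs = data.copy()
obs[mask] = np.nan

# Initialise missing entries with column means, then refine with rank-1 PCA fits
colmean = np.nanmean(obs, axis=0)
filled = np.where(mask, colmean, obs)
for _ in range(20):
    mu = filled.mean(axis=0)
    U, s, Vt = np.linalg.svd(filled - mu, full_matrices=False)
    approx = mu + (U[:, :1] * s[:1]) @ Vt[:1]   # rank-1 fit matches the common signal
    filled[mask] = approx[mask]

rmse = np.sqrt(np.mean((filled[mask] - data[mask]) ** 2))
naive = np.sqrt(np.mean((np.broadcast_to(colmean, obs.shape)[mask] - data[mask]) ** 2))
```

Because the stations share a strong common signal, the PCA-based fill recovers missing values far more accurately than column-mean substitution, which is why high inter-station correlation makes eigenvector schemes attractive for this task.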

Proceedings ArticleDOI
01 May 1982
TL;DR: This mechanism is shown to involve principal component (or Loeve-Karhunen) analysis as an intermediate step in the complete canonical coordinate determination process, and can lead to a substantial simplification in the computational complexity that is entailed in handling a class of non-euclidean error criteria.
Abstract: This paper describes the use of a canonical signal compression and modelling technique that permits the minimization of certain non-euclidean types of signal resynthesis error criteria. The technique is based on the construction of a non-orthogonal transformation from the original sampled signal representation, in either the time, frequency or spatial domain, to a special canonical coordinate domain. The parameters characterizing this transformation are then chosen to minimize the specified error criterion, for each level of truncation of the canonical coordinate based signal representation. A mechanism for factoring this canonical coordinate transformation into an eigenvector-eigenvalue based correlation simplification process and an error metric simplification process, is described. This mechanism is shown to involve principal component (or Loeve-Karhunen) analysis as an intermediate step in the complete canonical coordinate determination process, and can lead to a substantial simplification in the computational complexity that is entailed in handling a class of non-euclidean error criteria.