Showing papers on "Principal component analysis" published in 1981


Journal ArticleDOI
TL;DR: The model presented here calculates a linear combination of variables that quantifies shape differences among populations, independent of size, and construe size and shape not as measured variables, but as general factors, linear combinations most parsimoniously accounting for the associations among the distance measures.
Abstract: Humphries, J. M., F. L. Bookstein, B. Chernoff, G. R. Smith, R. L. Elder, and S. G. Poss (Museum of Zoology, Center for Human Growth and Development, Museum of Paleontology, and Division of Biological Sciences, The University of Michigan, Ann Arbor, Michigan 48109) 1981. Multivariate discrimination by shape in relation to size. Syst. Zool., 30:291-308. The diverse methods for analyzing size-free shape differences tend to be guided by computational expediency rather than geometric principles. We question the use of ratios and ad hoc combinations of spatially unrelated measures. Neither are linear discriminant functions or series of independent regressions helpful to the visualization of shape differences. A bridge is needed between traditional quantitative methods and the geometrical analysis of shape. In principle any measured transects between landmarks of a form can serve as characters in a morphometric analysis. Systematic studies use a highly non-random sample of these, particularly biased regarding geometrical information. We suggest defining size and shape in terms of factors (estimates of information common to a universe of measured distances). The model presented here calculates a linear combination of variables that quantifies shape differences among populations, independent of size. In analyses in which the first two principal components confound size and shape, size is removed from one axis with shear coefficients derived from regression of general size on principal components centered by group. The general size factor is estimated by the principal axis of the within-group covariance matrix of the log-transformed data. Residuals from the regression of general size on the transformed axes approximate a shape-discriminating factor that is uncorrelated with size within group and displays the interpopulation shape differences borne by the first two principal components.
The results bear a direct and interpretable correspondence to biorthogonal analysis of shape difference. [Multivariate analysis; principal components; discriminant functions; morphometrics; size-free shape; allometry; fishes.] Systematists need procedures that allow them to discriminate among groups of organisms that vary in size. The groups included in a study can be chosen a priori (e.g., several species or geographic populations within a species) or a posteriori (as a conclusion resulting from some method of analysis). However the groups are chosen, it has long been considered desirable to discriminate among them on the basis of size-free shape derived from distance measures. The terms shape and size have been used in various and sometimes conflicting ways (Huxley, 1932; Thompson, 1942; Simpson, Roe and Lewontin, 1960; Gould, 1966; Mosimann, 1970; Sprent, 1972; Bookstein, 1978). We construe size and shape not as measured variables, but as general factors, linear combinations most parsimoniously accounting for the associations among the distance measures. Size, in particular, is not a single variable such as biomass or a standard length, but a factor which, when called upon to predict all the distance measures within a population, leaves the smallest mean squared residual. We prefer a factor whose algebraic form acknowledges the allometric relationship (Jolicoeur, 1963). Our shape discriminators need to be independent of size (Flessa and Bray, 1977; Mosimann and James, 1979) in order to partition out the effects of growth (e.g., individuals of differing age and size). In general, shape can be defined as the geometry of the organism after "information about position, scale, and orientation" has been removed (Bookstein, 1978:8). There is then an endless variety of shape information remaining. While the quantification of size as a general factor de-

492 citations
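
As a rough illustration of one step named in the abstract above, the general size factor can be sketched as the first principal axis of the pooled within-group covariance matrix of log-transformed distance measures. This covers only that step, not the shear procedure. The function name `general_size_axis`, the pooling by degrees of freedom, and the synthetic measurements are assumptions made for illustration and do not come from the paper.

```python
import numpy as np

def general_size_axis(measurements, groups):
    """First principal axis of the pooled within-group covariance of log data.

    measurements: (n, p) array of positive distance measures.
    groups: length-n array of group labels.
    Returns the unit-length axis and each specimen's score on it.
    """
    logs = np.log(np.asarray(measurements, dtype=float))
    groups = np.asarray(groups)
    labels = np.unique(groups)
    centered = logs.copy()
    for g in labels:
        mask = groups == g
        centered[mask] -= logs[mask].mean(axis=0)   # center each group on its own mean
    within_cov = centered.T @ centered / (len(logs) - len(labels))
    eigvals, eigvecs = np.linalg.eigh(within_cov)    # eigenvalues in ascending order
    axis = eigvecs[:, -1]                            # axis with the largest variance
    if axis.sum() < 0:                               # fix the arbitrary sign
        axis = -axis
    return axis, centered @ axis

# Synthetic example (hypothetical data): three measures that grow allometrically
# with a latent size, in two groups that differ in shape.
rng = np.random.default_rng(0)
log_size = rng.normal(0.0, 0.5, size=60)
allometry = np.array([1.0, 0.8, 1.2])
groups = np.repeat(["A", "B"], 30)
offsets = np.where(groups[:, None] == "A", 0.0, 0.3) * np.array([1.0, -1.0, 0.0])
data = np.exp(log_size[:, None] * allometry + offsets + rng.normal(0, 0.05, (60, 3)))
axis, size_scores = general_size_axis(data, groups)
```

Because each group is centered on its own mean before the covariance is pooled, differences between group means do not enter the estimated axis.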


Journal ArticleDOI
TL;DR: In this paper, the principal components derived by Wallace and Gutzler (1981) from a 500 mb height data set are linearly transformed using the varimax method, which emphasizes the strongest relationships within the 500 mb height data set.
Abstract: The principal components derived by Wallace and Gutzler (1981) from a 500 mb height data set are linearly transformed using the varimax method. Their data set consists of 45 winter months of National Meteorological Center analyses of Northern Hemisphere 500 mb height. The linear transformation (or rotation) of the principal components emphasizes the strongest relationships within the 500 mb height data set; hence, spatial patterns associated with the rotated principal components are simpler to interpret than the spatial patterns associated with the unrotated components. The teleconnection patterns identified by Wallace and Gutzler (1981) on the basis of the negative extrema approach closely resemble several of the spatial patterns of the rotated principal components. In order to show the seasonal dependence of the rotated principal components, an expanded data set consisting of 30 years of 500 mb height data is used. Most of the teleconnection patterns derived from the 90 winter month data set ar...

384 citations
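
The varimax rotation mentioned in this entry can be sketched with a commonly used SVD-based iteration. This is a minimal illustration on synthetic data, not the procedure or data of the paper; the function name `varimax`, the stopping rule, and the random test matrix are assumptions.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonally rotate a (p, k) loading matrix toward the varimax criterion.

    Returns the rotated loadings and the (k, k) rotation matrix.
    """
    loadings = np.asarray(loadings, dtype=float)
    p, k = loadings.shape
    rotation = np.eye(k)
    objective = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        col_ss = (rotated ** 2).sum(axis=0)          # column sums of squared loadings
        target = loadings.T @ (rotated ** 3 - rotated * col_ss / p)
        u, s, vt = np.linalg.svd(target)
        rotation = u @ vt                            # nearest orthogonal matrix
        new_objective = s.sum()
        if new_objective < objective * (1 + tol):    # stop when improvement stalls
            break
        objective = new_objective
    return loadings @ rotation, rotation

# Synthetic example (hypothetical data): two leading principal components.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 6))
X[:, 1] += X[:, 0]
X[:, 3] += X[:, 2]
eigvals, eigvecs = np.linalg.eigh(np.cov(X, rowvar=False))
top = np.argsort(eigvals)[::-1][:2]
loadings = eigvecs[:, top] * np.sqrt(eigvals[top])
rotated, rotation = varimax(loadings)
```

The rotation is orthogonal, so each variable's communality (its row sum of squared loadings) is unchanged; only the distribution of the loadings across components changes.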


Journal ArticleDOI
TL;DR: In this paper, Monte Carlo methods are used to compare the performance of several robust procedures for estimating a correlation matrix and its principal components. The M-estimators and multivariate trimming are especially effective at estimating the principal components, including a near singularity, but the M-estimators can break down when the dimensionality is large and the outliers are asymmetric.
Abstract: This paper uses Monte Carlo methods to compare the performances of several robust procedures for estimating a correlation matrix and its principal components. The estimators are formed either from separate bivariate analyses or by simultaneous manipulation of all variables by using techniques such as multivariate trimming and M-estimation. The M-estimators stand up exceptionally well. They and the multivariate trimming procedure are especially effective at estimating the principal components, including a near singularity. However, the M-estimators can break down relatively easily when the dimensionality is large and the outliers are asymmetric. With missing data, the element-wise approach becomes more attractive.

355 citations


Journal ArticleDOI
TL;DR: This paper presented an overview of an approach to the quantitative analysis of qualitative data with theoretical and methodological explanations of the two cornerstones of the approach, Alternating Least Squares and Optimal Scaling.
Abstract: This paper presents an overview of an approach to the quantitative analysis of qualitative data with theoretical and methodological explanations of the two cornerstones of the approach, Alternating Least Squares and Optimal Scaling. Using these two principles, my colleagues and I have extended a variety of analysis procedures originally proposed for quantitative (interval or ratio) data to qualitative (nominal or ordinal) data, including additivity analysis and analysis of variance; multiple and canonical regression; principal components; common factor and three mode factor analysis; and multidimensional scaling. The approach has two advantages: (a) If a least squares procedure is known for analyzing quantitative data, it can be extended to qualitative data; and (b) the resulting algorithm will be convergent. Three completely worked through examples of the additivity analysis procedure and the steps involved in the regression procedures are presented.

302 citations


Journal ArticleDOI
TL;DR: It is shown that in the context of multivariate statistical analysis and statistical pattern recognition the three transforms are very similar if a specific estimate of the column covariance matrix is used.

282 citations


Journal ArticleDOI
TL;DR: In this article, the geometry of canonical variate analysis is described as a two-stage orthogonal rotation, where the first stage involves a principal component analysis of the original variables.
Abstract: The geometry of canonical variate analysis is described as a two-stage orthogonal rotation. The first stage involves a principal component analysis of the original variables. The second stage involves a principal component analysis of the group means for the orthonormal variables from the first-stage eigenanalysis. The geometry of principal component analysis is also outlined. Algebraic aspects of canonical variate analysis are discussed and these are related to the geometrical description. Some practical implications of the geometrical approach for stability of the canonical vectors and variable selection are presented. [Multivariate analysis; canonical variate analysis; discriminant analysis; principal component analysis.]

277 citations
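
The two-stage description in the abstract above can be sketched as follows: a principal component analysis of the pooled within-group covariance matrix that rescales the data to orthonormal variables, followed by a principal component analysis of the group means in that space. This is a minimal sketch under assumptions; the function names, the size-weighted between-group matrix, and the synthetic data are illustrative and not taken from the paper.

```python
import numpy as np

def pooled_within_cov(X, labels):
    """Pooled within-group covariance matrix."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    groups = np.unique(labels)
    W = np.zeros((X.shape[1], X.shape[1]))
    for g in groups:
        dev = X[labels == g] - X[labels == g].mean(axis=0)
        W += dev.T @ dev
    return W / (len(X) - len(groups))

def canonical_variates(X, labels):
    """Canonical vectors and roots via two successive eigenanalyses."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    groups = np.unique(labels)
    # Stage 1: eigenanalysis of the within-group covariance, scaled to unit variance.
    evals, evecs = np.linalg.eigh(pooled_within_cov(X, labels))
    to_orthonormal = evecs / np.sqrt(evals)
    # Stage 2: eigenanalysis of the group means of the orthonormal variables.
    Z = (X - X.mean(axis=0)) @ to_orthonormal
    means = np.array([Z[labels == g].mean(axis=0) for g in groups])
    sizes = np.array([(labels == g).sum() for g in groups])
    between = (means * sizes[:, None]).T @ means / (len(groups) - 1)
    roots, rotation = np.linalg.eigh(between)
    order = np.argsort(roots)[::-1][: len(groups) - 1]
    return to_orthonormal @ rotation[:, order], roots[order]

# Synthetic example (hypothetical data): three groups, four variables.
rng = np.random.default_rng(2)
labels = np.repeat([0, 1, 2], 40)
centers = np.array([[0, 0, 0, 0], [2, 1, 0, 0], [0, 2, 1, 0]], dtype=float)
X = centers[labels] + rng.normal(size=(120, 4))
vectors, roots = canonical_variates(X, labels)
scores = X @ vectors
```

With this scaling the canonical variate scores have an identity within-group covariance matrix, which is one way to check the construction.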


Journal ArticleDOI
TL;DR: This second part of a three-part series discusses principal components and factor analysis, two techniques that are finding increasing application among quality engineers who are concerned with processes with more than one response variable.
Abstract: Principal components and factor analysis are two techniques that are finding increasing application among quality engineers who are concerned with processes with more than one response variable. In this second part of a three-part series, the discussion..

79 citations


Journal ArticleDOI
TL;DR: Six estimation procedures are compared and if the assumption of equal variance is relaxed, the methods based on the sample correlation matrix perform better although others are surprisingly robust.
Abstract: Analysis of variance and principal components methods have been suggested for estimating repeatability. In this study, six estimation procedures are compared: ANOVA, principal components based on the sample covariance matrix and also on the sample correlation matrix, a related multivariate method (structural analysis) based on the sample covariance matrix and also on the sample correlation matrix, and maximum likelihood estimation. A simulation study indicates that when the standard linear model assumptions are met, the estimators are quite similar except when the repeatability is small. Overall, maximum likelihood appears the preferred method. If the assumption of equal variance is relaxed, the methods based on the sample correlation matrix perform better although others are surprisingly robust. The structural analysis method (with sample correlation matrix) appears to be best.

67 citations


Journal ArticleDOI
TL;DR: In this paper, influence functions for a variety of parametric functions in multivariate analysis are obtained, including the generalized variance, the matrix of regression coefficients, the noncentrality matrix Σ⁻¹δ, the matrix L (a generalization of 1 − R²), canonical correlations, principal components, and parameters that correspond to Pillai's statistic (1955), Hotelling's (1951) generalized T₀² and Wilks' Λ (1932).
Abstract: The influence function introduced by Hampel (1968, 1973, 1974) is a tool that can be used for outlier detection. Campbell (1978) has obtained the influence function for Mahalanobis’s distance between two populations, which can be used for detecting outliers in discriminant analysis. In this paper influence functions for a variety of parametric functions in multivariate analysis are obtained. Influence functions are obtained for the generalized variance, the matrix of regression coefficients, the noncentrality matrix Σ⁻¹δ in multivariate analysis of variance and its eigenvalues, the matrix L, which is a generalization of 1 − R², canonical correlations, principal components, and parameters that correspond to Pillai’s statistic (1955), Hotelling’s (1951) generalized T₀² and Wilks’ Λ (1932); all of these can be used for outlier detection in multivariate analysis. Devlin, Gnanadesikan and Kettenring (1975) have obtained the influence function for the population correlation coefficient in the bivariate case. It is shown in...

60 citations


Journal ArticleDOI
R. C. Tabony
TL;DR: In this paper, the main patterns of European rainfall anomalies were obtained from a principal component analysis of 182 homogeneous rainfall series from 1861 to 1970, and the most important component corresponded to an anomaly of the same sign and magnitude covering most of the area examined.
Abstract: The main patterns of European rainfall anomalies were obtained from a principal component analysis of 182 homogeneous rainfall series from 1861 to 1970. The most important component corresponded to an anomaly of the same sign and magnitude covering most of the area examined. The principal components were essentially the same for all seasons, and limited networks of stations back to 1786 showed that they were stable with time. Spectral analysis of the patterns revealed cycles in European rainfall at periods of 2.4 years in summer and 2.1 and 5 years in the winter half year. The first of these is the most likely to represent a permanent feature of the atmosphere.

57 citations


Journal ArticleDOI
TL;DR: This article conducted an empirical investigation of the properties of lines fitted by eye and found that students tended to choose slopes near that of the first principal component (major axis) of the data, and their lines passed close to the centroid.
Abstract: Because little is known about properties of lines fitted by eye, we designed and carried out an empirical investigation. Inexperienced graduate and postdoctoral students instructed to locate a line for estimating y from x for four sets of points tended to choose slopes near that of the first principal component (major axis) of the data, and their lines passed close to the centroids. Students had a slight tendency to choose consistently either steeper or shallower slopes for all sets of data.
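
To make the comparison in this abstract concrete, the sketch below computes both the least-squares slope for estimating y from x and the slope of the first principal component (major axis) for a small set of points. The function name and the six data points are made up for illustration and are not the data sets used in the study.

```python
import numpy as np

def ols_and_major_axis_slopes(x, y):
    """Slopes of the least-squares line (y on x) and of the major axis.

    Both lines pass through the centroid (mean of x, mean of y).
    Assumes the major axis is not vertical.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    cov = np.cov(x, y)                       # 2 x 2 sample covariance matrix
    ols_slope = cov[0, 1] / cov[0, 0]
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    direction = eigvecs[:, -1]               # first principal component
    major_slope = direction[1] / direction[0]
    return ols_slope, major_slope

# Hypothetical points, for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([1.2, 1.9, 3.4, 3.6, 5.3, 5.8])
ols_slope, major_slope = ols_and_major_axis_slopes(x, y)
centroid = (x.mean(), y.mean())
```

The major-axis slope is never smaller in magnitude than the least-squares slope of y on x, so a line drawn near the major axis is at least as steep as the regression line.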

Journal ArticleDOI
TL;DR: Criteria indicate that principal component analysis performs best, but inaccurately, with variable standardization and the correlation matrix, and that nonlinear mapping consistently performs poorly, while performance of methods is consistent with the nonlinearity of Abronia data.
Abstract: Pimentel, R. A. (Department of Biological Sciences, California Polytechnic State Univ., San Luis Obispo, California) 1981. A comparative study of data and ordination techniques based on a hybrid swarm of sand verbenas (Abronia Juss.). Syst. Zool., 30:250-267. The influence of kinds of variables, data errors, standardizations, similarity coefficients, and ordination techniques is judged in reference to a model involving hybridization and introgression in three species of sand verbenas, Abronia (Juss.). Various criteria indicate that principal component analysis performs best, but inaccurately, with variable standardization and the correlation matrix, and that nonlinear mapping consistently performs poorly. Excellent results were obtained from principal coordinate analysis with Gower's general similarity coefficient based upon quantitative and multistate variables; a 'diagnostic' character measured with known error; and detailed color evaluation. This and other principal coordinate analysis results were further improved by nonmetric multidimensional scaling. Performance of methods is consistent with the nonlinearity of Abronia data. It is assumed that nonlinearity is common in taxonomic data. [Multivariate analysis; multivariate analysis assumptions; discriminant analysis; classification methods; principal component analysis; principal coordinate analysis; nonlinear mapping; nonmetric multidimensional scaling; ordination comparisons; data properties; nonlinearity; sand verbenas, Abronia.]

Journal ArticleDOI
TL;DR: In this article, a robust principal component analysis for samples from a bivariate distribution function is described, which is based on robust estimators for dispersion in the univariate case along with a certain linearization of the bivariate structure.

Journal ArticleDOI
TL;DR: This paper clarifies the nature of redundancy analysis and its relationships to canonical correlation and multivariate multiple linear regression; regression on component scores, extended to include an orthogonal rotation of the components, is shown to be identical to van den Wollenberg's maximum redundancy solution.
Abstract: This paper attempts to clarify the nature of redundancy analysis and its relationships to canonical correlation and multivariate multiple linear regression. Stewart and Love introduced redundancy analysis to provide non-symmetric measures of the dependence of one set of variables on the other, as channeled through the canonical variates. Van den Wollenberg derived sets of variates which directly maximize the between set redundancy. Multivariate multiple linear regression on component scores (such as principal components) is considered. The problem is extended to include an orthogonal rotation of the components. The solution is shown to be identical to van den Wollenberg's maximum redundancy solution.

Journal ArticleDOI
TL;DR: This third and concluding part of a three-part series discusses principal components and factor analysis, two techniques that are finding increasing application among quality engineers who are concerned with processes with more than one response variable.
Abstract: Principal components and factor analysis are two techniques that are finding increasing application among quality engineers who are concerned with processes with more than one response variable. In this third and concluding part of a three-part series,

Journal ArticleDOI
TL;DR: In this article, the authors examined temporal changes in the cranial architecture of Arikara Amerindians from five archaeological sites in South Dakota which span a time period of approximately 230 years (ca. A.D. 1600-1830).
Abstract: We have examined temporal changes in the cranial architecture of Arikara Amerindians from five archaeological sites in South Dakota which span a time period of approximately 230 years (ca. A.D. 1600–1830). We have utilized a multivariate statistical method based on a principal components analysis of the pooled within-groups correlation matrix rather than the more traditional methods of ascertaining morphological relationships, e.g., discriminant functions, Mahalanobis' D², or Penrose's Size and Shape. Our component structure, based on a regional sample and the mathematically simpler principal components analysis, is very similar to the factor structure obtained by Howells (1973) using a world-wide sample and factor analysis proper. This supports the notion of the “universality” of cranial structure. An axis of temporal variation was introduced into the component space by means of multiple regression. This analysis indicates that a substantial portion of the intergroup variation is temporal in nature and that systematic temporal changes occur along the facial height, transverse frontal flatness, and frontal profile flatness components. Earlier analyses of the same material by more conventional methods either did not detect the temporal trends at all or failed to isolate the specific nature of the temporal changes. The success of the present analysis attests to the value of examining morphological relationships by means of principal components.

Book ChapterDOI
01 Jan 1981
TL;DR: In this paper, an alternative approach to factor analysis, Target Transformation Factor Analysis, is introduced and its application to a subset of particle composition data from the Regional Air Pollution Study (RAPS) of St. Louis, Missouri is presented.
Abstract: Among the multivariate statistical techniques that have been used as source-receptor models, factor analysis is the most widely employed. The basic objective of factor analysis is to allow the variation within a set of data to determine the number of independent causalities, i.e. sources of particles. It also permits the combination of the measured variables into new axes for the system that can be related to specific particle sources. The principles of factor analysis are reviewed and the principal components method is illustrated by the reanalysis of aerosol composition results from Charleston, West Virginia. An alternative approach to factor analysis, Target Transformation Factor Analysis, is introduced and its application to a subset of particle composition data from the Regional Air Pollution Study (RAPS) of St. Louis, Missouri is presented.

Journal ArticleDOI
TL;DR: There is evidence favoring the varimaxed solution of the PCA for the parametrization of most of the experimental data and an additional intermediate component that seems to reduce the between-subject variance.

Journal ArticleDOI
TL;DR: In general, disproportionately heavy sampling of the ends of a gradient increases the interpretability of eigenvector ordinations. In particular, correspondence analysis (CA) and detrended correspondence analysis (DCA) best reproduce the original positions of samples in simulated coenoclines when samples are clustered toward the ends of the axis.
Abstract: In general, disproportionately heavy sampling of the ends of a gradient increases the interpretability of eigenvector ordinations. More specifically, correspondence analysis (CA) and detrended correspondence analysis (DCA) best reproduce the original positions of samples in simulated coenoclines when samples are clustered toward the ends of the axis. Principal components analysis (PCA) reproduces the original sample positions less well than either CA or DCA and shows no improvement as samples are increasingly clustered toward the ends of the axis. PCA and CA show less curvature of one dimensional data into the second axis when sampling favors the ends of the axis.

Journal ArticleDOI
TL;DR: In this paper, the spatial distributions of point defects resulting from collision cascades in solids are analyzed and compared in a simulation of hundreds of cascades generated by projectiles in the keV energy range incident on polycrystalline gold.

Journal Article
TL;DR: A practical example is given in which principal components transformation revealed the presence of subpopulations in a four-dimensional data set.
Abstract: Principal components transformation may be used to explore the structure of a p-dimensional data set. It is difficult to detect inhomogeneities in a data set of multivariate variables by mere visual inspection of the numerical data. Plotting each variable's distribution is often either impractical, due to the number of variables involved, or might fail to reveal the presence of subpopulations due to high correlations. A practical example is given in which principal components transformation revealed the presence of subpopulations in a four-dimensional data set.

Journal Article
TL;DR: An iterative procedure, the so-called power method, for finding a multivariate distribution's eigenvectors and eigenvalues is demonstrated and the projection of feature vectors onto the principal components is shown.
Abstract: The principal components transformation offers an effective method for dimensionality reduction and for the assessment of the mutual dependence of observed variables in a data set. An iterative procedure, the so-called power method, for finding a multivariate distribution's eigenvectors and eigenvalues is demonstrated. The projection of feature vectors onto the principal components is shown.
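
The power method named in this abstract can be sketched as repeated multiplication by the covariance matrix with renormalization, followed by deflation to reach the next component. This is a minimal sketch, not the article's own procedure; the function name `power_method`, the deflation step, the example covariance matrix, and the feature vectors are assumptions for illustration.

```python
import numpy as np

def power_method(cov, n_components, max_iter=1000, tol=1e-12, seed=0):
    """Leading eigenvalues and eigenvectors of a symmetric positive
    semi-definite matrix by power iteration with deflation."""
    work = np.array(cov, dtype=float)
    rng = np.random.default_rng(seed)
    values, vectors = [], []
    for _component in range(n_components):
        v = rng.normal(size=work.shape[0])
        v /= np.linalg.norm(v)
        for _step in range(max_iter):
            w = work @ v
            norm = np.linalg.norm(w)
            if norm == 0.0:                   # nothing left to extract
                break
            w /= norm
            converged = np.linalg.norm(w - v) < tol
            v = w
            if converged:
                break
        eigenvalue = v @ work @ v             # Rayleigh quotient
        values.append(eigenvalue)
        vectors.append(v)
        work = work - eigenvalue * np.outer(v, v)   # deflate the found component
    return np.array(values), np.column_stack(vectors)

# Hypothetical covariance matrix and feature vectors.
cov = np.array([[4.0, 2.0, 0.0],
                [2.0, 3.0, 1.0],
                [0.0, 1.0, 2.0]])
values, vectors = power_method(cov, n_components=2)
features = np.array([[1.0, 0.0, 0.0],
                     [0.0, 1.0, 1.0]])
scores = features @ vectors                   # projection onto the components
```

Convergence slows when neighboring eigenvalues are close in size, so in practice a library eigensolver such as `numpy.linalg.eigh` is a useful check on the result.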

Journal ArticleDOI
TL;DR: In this article, the understorey vegetation of nine localities with different fire histories from open eucalypt forest near Melbourne, Victoria, was analyzed by principal component analysis.
Abstract: The understorey vegetation of nine localities with different fire histories from open eucalypt forest near Melbourne, Victoria, was analysed by principal component analysis. Floristically, localities were quite similar; however, structural differences caused mainly by different burning regimes of recent years were more evident. An analysis of presence-absence data displayed a marked discontinuity that was explainable in terms of the timing and intensity of a recent fire. Using unstandardized height data the pattern was related to inter- and intra-locality differences in time since the last fires. An analysis of standardized height data demonstrated a connection between understorey structure and fire frequency. Despite apparent differences in the scatter diagrams obtained, a statistical comparison of the analytical results indicated that, in many respects, the ordinations were similar.

Journal ArticleDOI
TL;DR: In this paper, the authors used principal component (eigenvector) analysis of Holocene paleoclimatic data to estimate the July temperature departures at an unknown site (Long Lake, Keewatin).

01 Jan 1981
TL;DR: In this article, the development of some methods for ordinally measurable data and its implications for regional statistical and econometric analyses are discussed, such as multiple regression analysis, clustering and classification, principal component analysis, and partial least squares.
Abstract: The essential feature of multivariate methods is that they aim at reducing the complexity of phenomena in which many variables or attributes are involved. Given this general feature, it is no surprise to see that these methods have been applied in various fields of research, such as economics, geography, medicine, biology, etc. This paper will be devoted to the development of some methods for ordinally measurable data and its implications for regional statistical and econometric analyses. We will deal with the following methods: multiple regression analysis (and related subjects such as interdependence analysis and discriminant analysis); clustering and classification; principal component analysis (and related subjects such as canonical correlations and partial least squares).

Journal ArticleDOI
TL;DR: In this paper, simple modifications of principal component methods are described that have distinct advantages for structural analysis of relations among educational and psychological variables, including the provision for the incorporation of prior beliefs about errors in the variables, computational efficiency, tractability for large battery analysis, and the availability of hypothesis testing procedures.
Abstract: Simple modifications of principal component methods are described that have distinct advantages for structural analysis of relations among educational and psychological variables. Advantages include the provision for the incorporation of prior beliefs about errors in the variables, computational efficiency, tractability for large battery analysis, and the availability of hypothesis testing procedures. The methods are contrasted theoretically and empirically with conventional principal component methods and with maximum likelihood factor analysis.

Journal ArticleDOI
TL;DR: In this article, a multivariate statistical technique, principal component analysis (PCA), is used to interpret pollen assemblages from archaeological context in terms of paleoenvironmental information.

Journal ArticleDOI
01 Sep 1981
TL;DR: In this paper, the use of principal components analysis to delimit regions with similar secular trends in rainfall totals is reviewed and an alternative method of regionalization is proposed and is applied to a data set previously analysed using principal component analysis.
Abstract: The use of principal components analysis to delimit regions with similar secular trends in rainfall totals is reviewed. An alternative method of regionalization is proposed and is applied to a data set previously analysed using principal components analysis.


Journal ArticleDOI
TL;DR: In this paper, the theories of principal component analysis and nonlinear least-squares projection techniques are outlined and compared, and several applications from various chemical fields are presented which show that a complete analysis of the underlying structure and dimensionality of a chemical data set should always include these nonlinear projection techniques.